On Dienstag, 19. April 2016 13:25:53 CEST Moritz Barsnick wrote: > Just some random observations from me. > > > BTW: > > ./ffmpeg -i ~/test2.mkv -vf signature=xml:filename=signature.xml -f null - > > fails because xml is not a valid parameter. > > ./ffmpeg -i ~/test2.mkv -vf > > signature=xml:filename=signature.xml:detectmode=full -f null - > > fails not, but write binary to the output file. Why? > > I think "format" is not the first parameter the filter accepts, and > you're passing "xml" as an unnamed parameter. (Not sure why it's > accepted in the second case though.) signature=format=xml:... > > > +@item detectmode > > +Enable or disable the matching process. > > +or 1 (to enable). Optionally you can set the mode to 2. Then the detection > > ends, > > There's a line or sentence missing. Furthermore, it's unclear which > symbolic values (described below) those numbers relate to. The whole sentence is crap (was meaningful with former revisions). Have forgotten to delete it.
> > > +Set the number of inputs. The option value must be a non negative interger. > ^^^^^^^^ > integer > > > +Set the path to witch the output is written. If there is more than one > > input, > ^^^^^ which > > > +To calculate the signature of an input video and store it in signature.bin: > > +@example > > +ffmpeg -i input.mkv -vf signature=filename=signature.bin -map 0:v -c > > rawvideo -f null - > > +@end example > > + > > +@item > > +To detect whether two videos matches and store the signatures in XML > > format in > > +signature0.xml and signature1.xml: > > +@example > > +ffmpeg -i input1.mkv -i input2.mkv -filter_complex "[0:0][1:0] > > signature=nb_inputs=2:detectmode=full:format=xml:filename=signature%d.xml" > > -map 0:v -map 1:v -c rawvideo -f null - > > When muxing to null, you shouldn't need to specify the rawvideo codec. > It doesn't affect your filter anyway, and ffmpeg recently defaults to > wrapped_avframe, which is faster. I think you don't even need to "-map > 0:v -map 1:v", the complex filter gets its "[0:0][1:0]" inputs anyway. > Any if you need both, you should conistently stick to ":0" or ":v" > stream specifiers, because you mean the exact same one in both cases. I think with map it should be faster because the framework does not decode the audio and the subtitles. Don't know abount :v, have updated it. > > > > + * calculates the jaccard distance and evaluate a pair of course > > signatures as good > ^^^^^^^^ evaluates > > > + * compares framesignatures and sort out signatures with a l1 distance > > over a given threshold. > ^^^^ sorts > ^^^^ above > > > > + /* The following calculate a summed area table (intpic) and brings the > > numbers > ^calculates > > + * in intpic to to the same denuminator. > ^^^^^ to ^^^^^^^^^^^ denominator > > + * So you only have to handle the numinator in the following sections. > ^^^^^^^^^ numerator > > + dh1 = inlink->h/32; > > + if (inlink->h%32) > > Shouldn't there be whitespace around the operators? > > > + if (!f) { > > + av_log(ctx, AV_LOG_ERROR, "cannot open xml file %s\n", filename); > > How about strerror(errno) additionally? I have added it (taken from the psnr filter mainly). > > > + if (!f) { > > + av_log(ctx, AV_LOG_ERROR, "cannot open file %s\n", filename); > > Ditto. > > > + ret = ff_request_frame(ctx->inputs[i]); > > + // TODO handle this in a better way? > > + // Problem is the following: > > + // Assuming two inputs, inputA with 50 frames, inputB with 100 > > frames > > + // simply returning ret when < 0 would result in not filtering > > inputB > > + // after 50 frames anymore, not wanted > > + // only returning ret at the end would result in only respecting > > the error > > + // values of the last input, please comment > > Dies this still need to be resolved? Yes. > > > + if(match.score != 0){ > Whitespace style nit: > if (match.score != 0) { > Same everywhere else. You have quite inconsistent use of whitespace for > "if"s, "while"s and '{'s. Thank you for the typos and the whitespace errors. They should be fixed now. Gerion
diff --git a/doc/filters.texi b/doc/filters.texi index 9c5930a..9884e1b 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -11822,9 +11822,6 @@ It accepts the following options: @table @option @item detectmode Enable or disable the matching process. -or 1 (to enable). Optionally you can set the mode to 2. Then the detection ends, -if the first matching sequence it reached. This should be slightly faster. -Per default the detection is disabled. Available values are: @@ -11840,11 +11837,11 @@ some cases. @end table @item nb_inputs -Set the number of inputs. The option value must be a non negative interger. +Set the number of inputs. The option value must be a non negative integer. Default value is 1. @item filename -Set the path to witch the output is written. If there is more than one input, +Set the path to which the output is written. If there is more than one input, the path must be a prototype, i.e. must contain %d or %0nd (where n is a positive integer), that will be replaced with the input number. If no filename is specified, no output will be written. This is the default. @@ -11889,14 +11886,14 @@ The option value must be a double value between 0 and 1. The default value is 0. @item To calculate the signature of an input video and store it in signature.bin: @example -ffmpeg -i input.mkv -vf signature=filename=signature.bin -map 0:v -c rawvideo -f null - +ffmpeg -i input.mkv -vf signature=filename=signature.bin -map 0:v -f null - @end example @item To detect whether two videos matches and store the signatures in XML format in signature0.xml and signature1.xml: @example -ffmpeg -i input1.mkv -i input2.mkv -filter_complex "[0:0][1:0] signature=nb_inputs=2:detectmode=full:format=xml:filename=signature%d.xml" -map 0:v -map 1:v -c rawvideo -f null - +ffmpeg -i input1.mkv -i input2.mkv -filter_complex "[0:v][1:v] signature=nb_inputs=2:detectmode=full:format=xml:filename=signature%d.xml" -map :v -f null - @end example @end itemize diff --git a/libavfilter/signature_lookup.c b/libavfilter/signature_lookup.c index b431073..bf55ecd 100644 --- a/libavfilter/signature_lookup.c +++ b/libavfilter/signature_lookup.c @@ -42,8 +42,8 @@ static void fill_l1distlut(uint8_t lut[]) int i,j,tmp_i,tmp_j,count; uint8_t dist; - for(i=0, count=0; i<242; i++){ - for(j=i+1; j<243; j++, count++){ + for (i = 0, count = 0; i < 242; i++) { + for (j = i + 1; j < 243; j++, count++) { /* ternary distance between i and j */ dist = 0; tmp_i = i; tmp_j = j; @@ -60,7 +60,7 @@ static void fill_l1distlut(uint8_t lut[]) static unsigned int intersection_word(const uint8_t *first, const uint8_t *second) { unsigned int val=0,i; - for(i=0; i < 28; i+=4){ + for (i = 0; i < 28; i += 4) { val += av_popcount( (first[i] & second[i] ) << 24 | (first[i+1] & second[i+1]) << 16 | (first[i+2] & second[i+2]) << 8 | @@ -75,7 +75,7 @@ static unsigned int intersection_word(const uint8_t *first, const uint8_t *secon static unsigned int union_word(const uint8_t *first, const uint8_t *second) { unsigned int val=0,i; - for(i=0; i < 28; i+=4){ + for (i = 0; i < 28; i += 4) { val += av_popcount( (first[i] | second[i] ) << 24 | (first[i+1] | second[i+1]) << 16 | (first[i+2] | second[i+2]) << 8 | @@ -93,11 +93,11 @@ static unsigned int get_l1dist(AVFilterContext *ctx, SignatureContext *sc, const unsigned int dist = 0; uint8_t f,s; - for(i=0; i < SIGELEM_SIZE/5; i++){ - if(first[i] != second[i]){ + for (i = 0; i < SIGELEM_SIZE/5; i++) { + if (first[i] != second[i]) { f = first[i]; s = second[i]; - if (f > s){ + if (f > s) { /* little variation of gauss sum formula */ dist += sc->l1distlut[243*242/2 - (243-s)*(242-s)/2 + f - s - 1]; } else { @@ -109,24 +109,24 @@ static unsigned int get_l1dist(AVFilterContext *ctx, SignatureContext *sc, const } /** - * calculates the jaccard distance and evaluate a pair of course signatures as good + * calculates the jaccard distance and evaluates a pair of course signatures as good * @return 0 if pair is bad, 1 otherwise */ static int get_jaccarddist(SignatureContext *sc, CourseSignature *first, CourseSignature *second) { int jaccarddist, i, composdist = 0, cwthcount = 0; - for(i=0; i<5; i++){ - if((jaccarddist = intersection_word(first->data[i], second->data[i])) > 0){ + for (i = 0; i < 5; i++) { + if ((jaccarddist = intersection_word(first->data[i], second->data[i])) > 0) { jaccarddist /= union_word(first->data[i], second->data[i]); } - if(jaccarddist >= sc->thworddist){ - if(++cwthcount > 2){ + if (jaccarddist >= sc->thworddist) { + if (++cwthcount > 2) { /* more than half (5/2) of distances are too wide */ return 0; } } composdist += jaccarddist; - if(composdist > sc->thcomposdist){ + if (composdist > sc->thcomposdist) { return 0; } } @@ -140,10 +140,10 @@ static int get_jaccarddist(SignatureContext *sc, CourseSignature *first, CourseS static int find_next_coursecandidate(SignatureContext *sc, CourseSignature *secondstart, CourseSignature **first, CourseSignature **second, int start) { /* go one coursesignature foreword */ - if(!start){ - if((*second)->next){ + if (!start) { + if ((*second)->next) { *second = (*second)->next; - } else if((*first)->next){ + } else if ((*first)->next) { *second = secondstart; *first = (*first)->next; } else { @@ -151,14 +151,14 @@ static int find_next_coursecandidate(SignatureContext *sc, CourseSignature *seco } } - while(1){ - if(get_jaccarddist(sc, *first, *second)) + while (1) { + if (get_jaccarddist(sc, *first, *second)) return 1; /* next signature */ - if((*second)->next){ + if ((*second)->next) { *second = (*second)->next; - } else if((*first)->next){ + } else if ((*first)->next) { *second = secondstart; *first = (*first)->next; } else { @@ -168,7 +168,7 @@ static int find_next_coursecandidate(SignatureContext *sc, CourseSignature *seco } /** - * compares framesignatures and sort out signatures with a l1 distance over a given threshold. + * compares framesignatures and sorts out signatures with a l1 distance above a given threshold. * Then tries to find out offset and differences between framerates with a hough transformation */ static MatchingInfo* get_matching_parameters(AVFilterContext *ctx, SignatureContext *sc, FineSignature *first, FineSignature *second) @@ -198,29 +198,29 @@ static MatchingInfo* get_matching_parameters(AVFilterContext *ctx, SignatureCont hspace_elem** hspace = av_malloc_array(MAX_FRAMERATE, sizeof(hspace_elem *)); /* initialize houghspace */ - for(i = 0; i < MAX_FRAMERATE; i++){ + for (i = 0; i < MAX_FRAMERATE; i++) { hspace[i] = av_malloc_array(2 * HOUGH_MAX_OFFSET + 1, sizeof(hspace_elem)); - for(j = 0; j < HOUGH_MAX_OFFSET; j++){ + for (j = 0; j < HOUGH_MAX_OFFSET; j++) { hspace[i][j].score = 0; hspace[i][j].dist = 99999; } } /* l1 distances */ - for(i = 0, f = first; i < COURSE_SIZE && f->next; i++, f = f->next){ + for (i = 0, f = first; i < COURSE_SIZE && f->next; i++, f = f->next) { pairs[i].size = 0; pairs[i].dist = 99999; pairs[i].a = f; - for(j = 0, s = second; j < COURSE_SIZE && s->next; j++, s = s->next){ + for (j = 0, s = second; j < COURSE_SIZE && s->next; j++, s = s->next) { /* l1 distance of finesignature */ l1dist = get_l1dist(ctx, sc, f->framesig, s->framesig); - if(l1dist < sc->thl1){ - if(l1dist < pairs[i].dist){ + if (l1dist < sc->thl1) { + if (l1dist < pairs[i].dist) { pairs[i].size = 1; pairs[i].dist = l1dist; pairs[i].b_pos[0] = j; pairs[i].b[0] = s; - } else if(l1dist == pairs[i].dist){ + } else if (l1dist == pairs[i].dist) { pairs[i].b[pairs[i].size] = s; pairs[i].b_pos[pairs[i].size] = j; pairs[i].size++; @@ -229,33 +229,33 @@ static MatchingInfo* get_matching_parameters(AVFilterContext *ctx, SignatureCont } } /* last incomplete coursesignature */ - if(f->next == NULL){ - for(; i < COURSE_SIZE; i++){ + if (f->next == NULL) { + for (; i < COURSE_SIZE; i++) { pairs[i].size = 0; pairs[i].dist = 99999; } } /* hough transformation */ - for(i = 0; i < COURSE_SIZE; i++){ - for(j = 0; j < pairs[i].size; j++){ - for(k = i+1; k < COURSE_SIZE; k++){ - for(l = 0; l < pairs[k].size; l++){ - if(pairs[i].b[j] != pairs[k].b[l]){ + for (i = 0; i < COURSE_SIZE; i++) { + for (j = 0; j < pairs[i].size; j++) { + for (k = i + 1; k < COURSE_SIZE; k++) { + for (l = 0; l < pairs[k].size; l++) { + if (pairs[i].b[j] != pairs[k].b[l]) { /* linear regression */ m = (pairs[k].b_pos[l]-pairs[i].b_pos[j]) / (k-i); /* good value between 0.0 - 2.0 */ framerate = (int) m*30 + 0.5; /* round up to 0 - 60 */ - if(framerate>0 && framerate <= MAX_FRAMERATE){ + if (framerate>0 && framerate <= MAX_FRAMERATE) { offset = pairs[i].b_pos[j] - ((int) m*i + 0.5); /* only second part has to be rounded up */ - if(offset > -HOUGH_MAX_OFFSET && offset < HOUGH_MAX_OFFSET){ - if(pairs[i].dist < pairs[k].dist){ - if(pairs[i].dist < hspace[framerate-1][offset+HOUGH_MAX_OFFSET].dist){ + if (offset > -HOUGH_MAX_OFFSET && offset < HOUGH_MAX_OFFSET) { + if (pairs[i].dist < pairs[k].dist) { + if (pairs[i].dist < hspace[framerate-1][offset+HOUGH_MAX_OFFSET].dist) { hspace[framerate-1][offset+HOUGH_MAX_OFFSET].dist = pairs[i].dist; hspace[framerate-1][offset+HOUGH_MAX_OFFSET].a = pairs[i].a; hspace[framerate-1][offset+HOUGH_MAX_OFFSET].b = pairs[i].b[j]; } } else { - if(pairs[k].dist < hspace[framerate-1][offset+HOUGH_MAX_OFFSET].dist){ + if (pairs[k].dist < hspace[framerate-1][offset+HOUGH_MAX_OFFSET].dist) { hspace[framerate-1][offset+HOUGH_MAX_OFFSET].dist = pairs[k].dist; hspace[framerate-1][offset+HOUGH_MAX_OFFSET].a = pairs[k].a; hspace[framerate-1][offset+HOUGH_MAX_OFFSET].b = pairs[k].b[l]; @@ -263,7 +263,7 @@ static MatchingInfo* get_matching_parameters(AVFilterContext *ctx, SignatureCont } score = hspace[framerate-1][offset+HOUGH_MAX_OFFSET].score + 1; - if(score > hmax) + if (score > hmax ) hmax = score; hspace[framerate-1][offset+HOUGH_MAX_OFFSET].score = score; } @@ -274,12 +274,12 @@ static MatchingInfo* get_matching_parameters(AVFilterContext *ctx, SignatureCont } } - if(hmax > 0){ + if (hmax > 0) { hmax = (int) (0.7*hmax); - for(i = 0; i < MAX_FRAMERATE; i++){ - for(j = 0; j < HOUGH_MAX_OFFSET; j++){ - if(hmax < hspace[i][j].score){ - if(c == NULL){ + for (i = 0; i < MAX_FRAMERATE; i++) { + for (j = 0; j < HOUGH_MAX_OFFSET; j++) { + if (hmax < hspace[i][j].score) { + if (c == NULL) { c = av_malloc(sizeof(MatchingInfo)); if (!c) av_log(ctx, AV_LOG_FATAL, "Could not allocate memory"); @@ -305,7 +305,7 @@ static MatchingInfo* get_matching_parameters(AVFilterContext *ctx, SignatureCont } } } - for(i = 0; i < MAX_FRAMERATE; i++){ + for (i = 0; i < MAX_FRAMERATE; i++) { av_freep(&hspace[i]); } av_freep(&hspace); @@ -320,23 +320,23 @@ static int iterate_frame(double frr, FineSignature **a, FineSignature **b, int f step = ((int) 0.5 + fcount * frr) /* current frame */ -((int) 0.5 + (fcount-1) * frr);/* last frame */ - if(dir == DIR_NEXT){ - if(frr >= 1.0){ - if((*a)->next){ + if (dir == DIR_NEXT) { + if (frr >= 1.0) { + if ((*a)->next) { *a = (*a)->next; } else { return DIR_NEXT_END; } - if(step==1){ - if((*b)->next){ + if (step == 1) { + if ((*b)->next) { *b = (*b)->next; (*bcount)++; } else { return DIR_NEXT_END; } } else { - if((*b)->next && (*b)->next->next){ + if ((*b)->next && (*b)->next->next) { *b = (*b)->next->next; (*bcount)++; } else { @@ -344,21 +344,21 @@ static int iterate_frame(double frr, FineSignature **a, FineSignature **b, int f } } } else { - if((*b)->next){ + if ((*b)->next) { *b = (*b)->next; (*bcount)++; } else { return DIR_NEXT_END; } - if(step==1){ - if((*a)->next){ + if (step == 1) { + if ((*a)->next) { *a = (*a)->next; } else { return DIR_NEXT_END; } } else { - if((*a)->next && (*a)->next->next){ + if ((*a)->next && (*a)->next->next) { *a = (*a)->next->next; } else { return DIR_NEXT_END; @@ -367,22 +367,22 @@ static int iterate_frame(double frr, FineSignature **a, FineSignature **b, int f } return DIR_NEXT; } else { - if(frr >= 1.0){ - if((*a)->prev){ + if (frr >= 1.0) { + if ((*a)->prev) { *a = (*a)->prev; } else { return DIR_PREV_END; } - if(step==1){ - if((*b)->prev){ + if (step == 1) { + if ((*b)->prev) { *b = (*b)->prev; (*bcount)++; } else { return DIR_PREV_END; } } else { - if((*b)->prev && (*b)->prev->prev){ + if ((*b)->prev && (*b)->prev->prev) { *b = (*b)->prev->prev; (*bcount)++; } else { @@ -390,21 +390,21 @@ static int iterate_frame(double frr, FineSignature **a, FineSignature **b, int f } } } else { - if((*b)->prev){ + if ((*b)->prev) { *b = (*b)->prev; (*bcount)++; } else { return DIR_PREV_END; } - if(step==1){ - if((*a)->prev){ + if (step == 1) { + if ((*a)->prev) { *a = (*a)->prev; } else { return DIR_PREV_END; } } else { - if((*a)->prev && (*a)->prev->prev){ + if ((*a)->prev && (*a)->prev->prev) { *a = (*a)->prev->prev; } else { return DIR_PREV_END; @@ -424,22 +424,22 @@ static MatchingInfo evaluate_parameters(AVFilterContext *ctx, SignatureContext * FineSignature *a,*b, *aprev, *bprev; int status = STATUS_NULL; - for(; infos != NULL; infos = infos->next){ + for (; infos != NULL; infos = infos->next) { a = infos->first; b = infos->second; - while(1){ + while (1) { dist = get_l1dist(ctx, sc, a->framesig, b->framesig); - if(dist > sc->thl1){ - if(a->confidence >= 1 || b->confidence >= 1){ + if (dist > sc->thl1) { + if (a->confidence >= 1 || b->confidence >= 1) { /* bad frame (because high different information) */ tolerancecount++; } - if(tolerancecount > 2){ + if (tolerancecount > 2) { a = aprev; b = bprev; - if (dir == DIR_NEXT){ + if (dir == DIR_NEXT) { /* turn around */ a = infos->first; b = infos->second; @@ -457,40 +457,40 @@ static MatchingInfo evaluate_parameters(AVFilterContext *ctx, SignatureContext * aprev = a; bprev = b; - if(a->confidence < 1) gooda++; - if(b->confidence < 1) goodb++; + if (a->confidence < 1) gooda++; + if (b->confidence < 1) goodb++; } fcount++; dir = iterate_frame(infos->framerateratio, &a, &b, fcount, &bcount, dir); - if(dir == DIR_NEXT_END){ + if (dir == DIR_NEXT_END) { status = STATUS_END_REACHED; a = infos->first; b = infos->second; dir = iterate_frame(infos->framerateratio, &a, &b, fcount, &bcount, DIR_PREV); } - if(dir == DIR_PREV_END){ + if (dir == DIR_PREV_END) { status |= STATUS_BEGIN_REACHED; break; } - if(sc->thdi != 0 && bcount >= sc->thdi){ + if (sc->thdi != 0 && bcount >= sc->thdi) { break; /* enough frames found */ } } - if(bcount < sc->thdi) + if (bcount < sc->thdi) continue; /* matching sequence is too short */ - if((double)goodfcount/(double)fcount < sc->thit) + if ((double) goodfcount / (double) fcount < sc->thit) continue; - if((double)goodfcount*0.5 < FFMAX(gooda, goodb)) + if ((double) goodfcount*0.5 < FFMAX(gooda, goodb)) continue; - meandist = (double)goodfcount/(double)distsum; + meandist = (double) goodfcount / (double) distsum; - if(meandist < minmeandist || + if (meandist < minmeandist || status == STATUS_END_REACHED | STATUS_BEGIN_REACHED || mode == MODE_FAST){ minmeandist = meandist; @@ -507,13 +507,13 @@ static MatchingInfo evaluate_parameters(AVFilterContext *ctx, SignatureContext * } /* whole sequence is automatically best match */ - if(status == (STATUS_END_REACHED | STATUS_BEGIN_REACHED)){ + if (status == (STATUS_END_REACHED | STATUS_BEGIN_REACHED)) { bestmatch.whole = 1; break; } /* first matching sequence is enough, finding the best one is not necessary */ - if(mode == MODE_FAST){ + if (mode == MODE_FAST) { break; } } @@ -523,7 +523,7 @@ static MatchingInfo evaluate_parameters(AVFilterContext *ctx, SignatureContext * static void sll_free(MatchingInfo *sll) { void *tmp; - while(sll){ + while (sll) { tmp = sll; sll = sll->next; av_freep(&tmp); @@ -548,26 +548,26 @@ static MatchingInfo lookup_signatures(AVFilterContext *ctx, SignatureContext *sc fill_l1distlut(sc->l1distlut); /* stage 1: course signature matching */ - if(find_next_coursecandidate(sc, second->coursesiglist, &cs, &cs2, 1) == 0) + if (find_next_coursecandidate(sc, second->coursesiglist, &cs, &cs2, 1) == 0) return bestmatch; /* no candidate found */ do { av_log(ctx, AV_LOG_DEBUG, "Stage 1: got coursesignature pair. indices of first frame: %d and %d\n", cs->first->index, cs2->first->index); /* stage 2: l1-distance and hough-transform */ av_log(ctx, AV_LOG_DEBUG, "Stage 2: calculate matching parameters\n"); infos = get_matching_parameters(ctx, sc, cs->first, cs2->first); - if (av_log_get_level() == AV_LOG_DEBUG){ - for(i = infos; i != NULL; i = i->next){ + if (av_log_get_level() == AV_LOG_DEBUG) { + for (i = infos; i != NULL; i = i->next) { av_log(ctx, AV_LOG_DEBUG, "Stage 2: matching pair at %d and %d, ratio %f, offset %d\n", i->first->index, i->second->index, i->framerateratio, i->offset); } } /* stage 3: evaluation */ av_log(ctx, AV_LOG_DEBUG, "Stage 3: evaluate\n"); - if(infos){ + if (infos) { bestmatch = evaluate_parameters(ctx, sc, infos, bestmatch, mode); av_log(ctx, AV_LOG_DEBUG, "Stage 3: best matching pair at %d and %d, ratio %f, offset %d, score %d, %d frames matching\n", bestmatch.first->index, bestmatch.second->index, bestmatch.framerateratio, bestmatch.offset, bestmatch.score, bestmatch.matchframes); sll_free(infos); } - } while(find_next_coursecandidate(sc, second->coursesiglist, &cs, &cs2, 0) && !bestmatch.whole); + } while (find_next_coursecandidate(sc, second->coursesiglist, &cs, &cs2, 0) && !bestmatch.whole); return bestmatch; } diff --git a/libavfilter/vf_signature.c b/libavfilter/vf_signature.c index 7180e38..2b51724 100644 --- a/libavfilter/vf_signature.c +++ b/libavfilter/vf_signature.c @@ -119,7 +119,7 @@ static uint64_t get_block_sum(StreamContext *sc, uint64_t intpic[32][32], const x1 = b->to.x; y1 = b->to.y; - if(x0-1 >= 0 && y0-1 >= 0){ + if (x0-1 >= 0 && y0-1 >= 0) { sum = intpic[y1][x1] + intpic[y0-1][x0-1] - intpic[y1][x0-1] - intpic[y0-1][x1]; }else if(x0-1 >= 0){ sum = intpic[y1][x1] - intpic[y1][x0-1]; @@ -169,7 +169,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *picref) uint64_t conflist[DIFFELEM_SIZE]; int f = 0, g = 0, w = 0; int32_t dh1 = 1, dh2 = 1, dw1 = 1, dw2 = 1, a, b; - int64_t denum; + int64_t denom; int i,j,k,ternary; uint64_t blocksum; int blocksize; @@ -179,14 +179,14 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *picref) int64_t precfactor = (sc->divide) ? 65536 : BLOCK_LCM; /* initialize fs */ - if(sc->curfinesig){ + if (sc->curfinesig) { fs = av_mallocz(sizeof(FineSignature)); if (!fs) return AVERROR(ENOMEM); sc->curfinesig->next = fs; fs->prev = sc->curfinesig; sc->curfinesig = fs; - }else{ + } else { fs = sc->curfinesig = sc->finesiglist; sc->curcoursesig1->first = fs; } @@ -198,13 +198,13 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *picref) intjlut = av_malloc_array(inlink->w, sizeof(int)); if (!intjlut) return AVERROR(ENOMEM); - for (i=0; i < inlink->w; i++){ + for (i = 0; i < inlink->w; i++) { intjlut[i] = (i*32)/inlink->w; } - for (i=0; i < inlink->h; i++){ + for (i = 0; i < inlink->h; i++) { inti = (i*32)/inlink->h; - for (j=0; j< inlink->w; j++){ + for (j = 0; j < inlink->w; j++) { intj = intjlut[j]; intpic[inti][intj] += p[j]; } @@ -212,19 +212,19 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *picref) } av_freep(&intjlut); - /* The following calculate a summed area table (intpic) and brings the numbers - * in intpic to to the same denuminator. + /* The following calculates a summed area table (intpic) and brings the numbers + * in intpic to the same denominator. * So you only have to handle the numinator in the following sections. */ - dh1 = inlink->h/32; - if (inlink->h%32) + dh1 = inlink->h / 32; + if (inlink->h % 32) dh2 = dh1 + 1; - dw1 = inlink->w/32; - if (inlink->w%32) + dw1 = inlink->w / 32; + if (inlink->w % 32) dw2 = dw1 + 1; - denum = (sc->divide) ? dh1 * dh2 * dw1 * dw2 : 1; + denom = (sc->divide) ? dh1 * dh2 * dw1 * dw2 : 1; - for (i=0; i<32; i++){ + for (i = 0; i < 32; i++) { rowcount = 0; a = 1; if (dh2 > 1) { @@ -232,15 +232,15 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *picref) a -= ((inlink->h*i)%32 == 0) ? (inlink->h*i)/32 - 1 : (inlink->h*i)/32; a = (a == dh1)? dh2 : dh1; } - for (j=0; j<32; j++){ + for (j = 0; j < 32; j++) { b = 1; if (dw2 > 1) { b = ((inlink->w*(j+1))%32 == 0) ? (inlink->w*(j+1))/32 - 1 : (inlink->w*(j+1))/32; b -= ((inlink->w*j)%32 == 0) ? (inlink->w*j)/32 - 1 : (inlink->w*j)/32; b = (b == dw1)? dw2 : dw1; } - rowcount += intpic[i][j] * a * b * precfactor / denum; - if(i>0){ + rowcount += intpic[i][j] * a * b * precfactor / denom; + if (i > 0) { intpic[i][j] = intpic[i-1][j] + rowcount; } else { intpic[i][j] = rowcount; @@ -248,9 +248,9 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *picref) } } - denum = (sc->divide) ? 1 : dh1 * dh2 * dw1 * dw2; + denom = (sc->divide) ? 1 : dh1 * dh2 * dw1 * dw2; - for (i=0; i< ELEMENT_COUNT; i++){ + for (i = 0; i < ELEMENT_COUNT; i++) { const ElemCat* elemcat = elements[i]; int64_t* elemsignature; uint64_t* sortsignature; @@ -262,25 +262,25 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *picref) if (!sortsignature) return AVERROR(ENOMEM); - for(j=0; j<elemcat->elem_count; j++) { + for (j = 0; j < elemcat->elem_count; j++) { blocksum = 0; blocksize = 0; - for(k=0; k<elemcat->left_count; k++){ + for (k = 0; k < elemcat->left_count; k++) { blocksum += get_block_sum(sc, intpic, &elemcat->blocks[j*elemcat->block_count+k]); blocksize += get_block_size(&elemcat->blocks[j*elemcat->block_count+k]); } sum = blocksum / blocksize; - if (elemcat->av_elem){ - sum -= 128 * precfactor * denum; + if (elemcat->av_elem) { + sum -= 128 * precfactor * denom; } else { blocksum = 0; blocksize = 0; - for(; k< elemcat->block_count; k++){ + for (; k < elemcat->block_count; k++) { blocksum += get_block_sum(sc, intpic, &elemcat->blocks[j*elemcat->block_count+k]); blocksize += get_block_size(&elemcat->blocks[j*elemcat->block_count+k]); } sum -= blocksum / blocksize; - conflist[g++] = FFABS(sum * 8 / (precfactor * denum)); + conflist[g++] = FFABS(sum * 8 / (precfactor * denom)); } elemsignature[j] = sum; @@ -292,8 +292,8 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *picref) th = sortsignature[(int) (elemcat->elem_count*0.333)]; /* ternarize */ - for(j=0; j<elemcat->elem_count; j++) { - if (elemsignature[j] < -th){ + for (j = 0; j < elemcat->elem_count; j++) { + if (elemsignature[j] < -th) { ternary = 0; } else if (elemsignature[j] <= th) { ternary = 1; @@ -302,7 +302,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *picref) } fs->framesig[f/5] += ternary * pot3[f%5]; - if(f==wordvec[w]){ + if (f == wordvec[w]) { fs->words[s2usw[w]/5] += ternary * pot3[wordt2b[s2usw[w]/5]++]; if (w < 24) w++; @@ -318,8 +318,8 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *picref) fs->confidence = FFMIN(conflist[DIFFELEM_SIZE/2], 255); /* coursesignature */ - if(sc->coursecount == 0){ - if(sc->curcoursesig2){ + if (sc->coursecount == 0) { + if (sc->curcoursesig2) { sc->curcoursesig1 = av_mallocz(sizeof(CourseSignature)); if (!sc->curcoursesig1) return AVERROR(ENOMEM); @@ -328,7 +328,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *picref) sc->courseend = sc->curcoursesig1; } } - if(sc->coursecount == 45){ + if (sc->coursecount == 45) { sc->midcourse = 1; sc->curcoursesig2 = av_mallocz(sizeof(CourseSignature)); if (!sc->curcoursesig2) @@ -337,13 +337,13 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *picref) sc->curcoursesig1->next = sc->curcoursesig2; sc->courseend = sc->curcoursesig2; } - for(i=0; i<5; i++){ + for (i = 0; i < 5; i++) { set_bit(sc->curcoursesig1->data[i], fs->words[i]); } /* assuming the actual frame is the last */ sc->curcoursesig1->last = fs; - if(sc->midcourse){ - for(i=0; i<5; i++){ + if (sc->midcourse) { + for (i = 0; i < 5; i++) { set_bit(sc->curcoursesig2->data[i], fs->words[i]); } sc->curcoursesig2->last = fs; @@ -356,7 +356,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *picref) av_log(ctx, AV_LOG_DEBUG, "input %d, confidence: %d\n", FF_INLINK_IDX(inlink), fs->confidence); av_log(ctx, AV_LOG_DEBUG, "words:"); - for (i=0; i< 5; i++){ + for (i = 0; i < 5; i++) { av_log(ctx, AV_LOG_DEBUG, " %d:", fs->words[i] ); av_log(ctx, AV_LOG_DEBUG, " %d", fs->words[i] / pot3[0] ); for (j = 1; j < 5; j++) @@ -366,7 +366,7 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *picref) av_log(ctx, AV_LOG_DEBUG, "\n"); av_log(ctx, AV_LOG_DEBUG, "framesignature:"); - for (i=0; i< SIGELEM_SIZE/5; i++){ + for (i = 0; i < SIGELEM_SIZE/5; i++) { av_log(ctx, AV_LOG_DEBUG, " %d", fs->framesig[i] / pot3[0] ); for (j = 1; j < 5; j++) av_log(ctx, AV_LOG_DEBUG, ",%d", fs->framesig[i] % pot3[j-1] / pot3[j] ); @@ -389,8 +389,11 @@ static int xml_export(AVFilterContext *ctx, StreamContext *sc, const char* filen f = fopen(filename, "w"); if (!f) { - av_log(ctx, AV_LOG_ERROR, "cannot open xml file %s\n", filename); - return AVERROR(EINVAL); + int err = AVERROR(EINVAL); + char buf[128]; + av_strerror(err, buf, sizeof(buf)); + av_log(ctx, AV_LOG_ERROR, "cannot open xml file %s: %s\n", filename, buf); + return err; } /* header */ @@ -412,7 +415,7 @@ static int xml_export(AVFilterContext *ctx, StreamContext *sc, const char* filen fprintf(f, " </MediaTimeOfSpatialRegion>\n"); /* coursesignatures */ - for(cs = sc->coursesiglist; cs; cs = cs->next){ + for (cs = sc->coursesiglist; cs; cs = cs->next) { fprintf(f, " <VSVideoSegment>\n"); fprintf(f, " <StartFrameOfSegment>%" PRIu32 "</StartFrameOfSegment>\n", cs->first->index); fprintf(f, " <EndFrameOfSegment>%" PRIu32 "</EndFrameOfSegment>\n", cs->last->index); @@ -420,11 +423,11 @@ static int xml_export(AVFilterContext *ctx, StreamContext *sc, const char* filen fprintf(f, " <StartMediaTimeOfSegment>%" PRIu64 "</StartMediaTimeOfSegment>\n", cs->first->pts); fprintf(f, " <EndMediaTimeOfSegment>%" PRIu64 "</EndMediaTimeOfSegment>\n", cs->last->pts); fprintf(f, " </MediaTimeOfSegment>\n"); - for (i=0; i < 5; i++){ + for (i = 0; i < 5; i++) { fprintf(f, " <BagOfWords>"); - for (j=0; j < 31; j++){ + for (j = 0; j < 31; j++) { uint8_t n = cs->data[i][j]; - if (j < 30){ + if (j < 30) { fprintf(f, "%d %d %d %d %d %d %d %d ", (n & 0x80) >> 7, (n & 0x40) >> 6, (n & 0x20) >> 5, @@ -446,24 +449,24 @@ static int xml_export(AVFilterContext *ctx, StreamContext *sc, const char* filen } /* finesignatures */ - for(fs = sc->finesiglist; fs; fs = fs->next){ + for (fs = sc->finesiglist; fs; fs = fs->next) { fprintf(f, " <VideoFrame>\n"); fprintf(f, " <MediaTimeOfFrame>%" PRIu64 "</MediaTimeOfFrame>\n", fs->pts); /* confidence */ fprintf(f, " <FrameConfidence>%d</FrameConfidence>\n", fs->confidence); /* words */ fprintf(f, " <Word>"); - for (i=0; i< 5; i++){ + for (i = 0; i < 5; i++) { fprintf(f, "%d ", fs->words[i]); - if (i < 4){ + if (i < 4) { fprintf(f, " "); } } fprintf(f, "</Word>\n"); /* framesignature */ fprintf(f, " <FrameSignature>"); - for (i=0; i< SIGELEM_SIZE/5; i++){ - if (i>0) { + for (i = 0; i< SIGELEM_SIZE/5; i++) { + if (i > 0) { fprintf(f, " "); } fprintf(f, "%d ", fs->framesig[i] / pot3[0]); @@ -500,8 +503,11 @@ static int binary_export(AVFilterContext *ctx, StreamContext *sc, const char* fi f = fopen(filename, "wb"); if (!f) { - av_log(ctx, AV_LOG_ERROR, "cannot open file %s\n", filename); - return AVERROR(EINVAL); + int err = AVERROR(EINVAL); + char buf[128]; + av_strerror(err, buf, sizeof(buf)); + av_log(ctx, AV_LOG_ERROR, "cannot open file %s: %s\n", filename, buf); + return err; } init_put_bits(&buf, buffer, len); @@ -520,15 +526,15 @@ static int binary_export(AVFilterContext *ctx, StreamContext *sc, const char* fi put_bits32(&buf, 0xFFFFFFFF & sc->courseend->last->pts); /* EndMediaTimeOfSpatialRegion */ put_bits32(&buf, numofsegments); /* NumOfSegments */ /* coursesignatures */ - for(cs = sc->coursesiglist; cs; cs = cs->next){ + for (cs = sc->coursesiglist; cs; cs = cs->next) { put_bits32(&buf, cs->first->index); /* StartFrameOfSegment */ put_bits32(&buf, cs->last->index); /* EndFrameOfSegment */ put_bits(&buf, 1, 1); /* MediaTimeFlagOfSegment */ put_bits32(&buf, 0xFFFFFFFF & cs->first->pts); /* StartMediaTimeOfSegment */ put_bits32(&buf, 0xFFFFFFFF & cs->last->pts); /* EndMediaTimeOfSegment */ - for (i=0; i < 5; i++){ + for (i = 0; i < 5; i++) { /* put 243 bits ( = 7 * 32 + 19 = 8 * 28 + 19) into buffer */ - for (j=0; j < 30; j++){ + for (j = 0; j < 30; j++) { //TODO is it faster to use put_bits32 and shift? put_bits(&buf, 8, cs->data[i][j]); } @@ -537,15 +543,15 @@ static int binary_export(AVFilterContext *ctx, StreamContext *sc, const char* fi } /* finesignatures */ put_bits(&buf, 1, 0); /* CompressionFlag, only 0 supported */ - for(fs = sc->finesiglist; fs; fs = fs->next){ + for (fs = sc->finesiglist; fs; fs = fs->next) { put_bits(&buf, 1, 1); /* MediaTimeFlagOfFrame */ put_bits32(&buf, 0xFFFFFFFF & fs->pts); /* MediaTimeOfFrame */ put_bits(&buf, 8, fs->confidence); /* FrameConfidence */ - for (i=0; i< 5; i++){ + for (i = 0; i < 5; i++) { put_bits(&buf, 8, fs->words[i]); /* Words */ } /* framesignature */ - for (i=0; i< SIGELEM_SIZE/5; i++){ + for (i = 0; i < SIGELEM_SIZE/5; i++) { put_bits(&buf, 8, fs->framesig[i]); } } @@ -637,14 +643,14 @@ static av_cold int init(AVFilterContext *ctx) sc->coursecount = 0; sc->midcourse = 0; - if ((ret = ff_insert_inpad(ctx, i, &pad)) < 0){ + if ((ret = ff_insert_inpad(ctx, i, &pad)) < 0) { av_freep(&pad.name); return ret; } } /* check filename */ - if (sic->nb_inputs > 1 && strlen(sic->filename) > 0 && av_get_frame_filename(tmp, sizeof(tmp), sic->filename, 0) == -1){ + if (sic->nb_inputs > 1 && strlen(sic->filename) > 0 && av_get_frame_filename(tmp, sizeof(tmp), sic->filename, 0) == -1) { av_log(ctx, AV_LOG_ERROR, "The filename must contain %%d or %%0nd, if you have more than one input.\n"); return AVERROR(EINVAL); } @@ -677,21 +683,21 @@ static av_cold void uninit(AVFilterContext *ctx) } /* signature lookup */ - if(sic->mode != MODE_OFF){ + if (sic->mode != MODE_OFF) { /* iterate over every pair */ for (i = 0; i < sic->nb_inputs; i++) { sc = &(sic->streamcontexts[i]); if (sc && sc->curfinesig) { - for (j=i+1; j < sic->nb_inputs; j++){ + for (j = i+1; j < sic->nb_inputs; j++) { sc2 = &(sic->streamcontexts[j]); if (sc2 && sc2->curfinesig) { match = lookup_signatures(ctx, sic, sc, sc2, sic->mode); - if(match.score != 0){ + if (match.score != 0) { av_log(ctx, AV_LOG_INFO, "matching of video %d at %f and %d at %f, %d frames matching\n", i, ((double) match.first->pts * sc->time_base.num) / sc->time_base.den, j, ((double) match.second->pts * sc2->time_base.num) / sc2->time_base.den, match.matchframes); - if(match.whole) + if (match.whole) av_log(ctx, AV_LOG_INFO, "whole video matching\n"); } else { av_log(ctx, AV_LOG_INFO, "no matching of video %d and %d\n", i, j); @@ -703,20 +709,20 @@ static av_cold void uninit(AVFilterContext *ctx) } /* cleanup */ - if (sic->streamcontexts != NULL){ + if (sic->streamcontexts != NULL) { for (i = 0; i < sic->nb_inputs; i++) { sc = &(sic->streamcontexts[i]); finsig = sc->finesiglist; cousig = sc->coursesiglist; - while(finsig){ + while (finsig) { tmp = finsig; finsig = finsig->next; av_freep(&tmp); } sc->finesiglist = NULL; - while(cousig){ + while (cousig) { tmp = cousig; cousig = cousig->next; av_freep(&tmp);
>From fa01efde31ffbeeec6a5e5b528a56d127460e83a Mon Sep 17 00:00:00 2001 From: Gerion Entrup <gerion.ent...@flump.de> Date: Sun, 20 Mar 2016 11:10:31 +0100 Subject: [PATCH] add signature filter for MPEG7 video signature This filter does not implement all features of MPEG7. Missing features: - compression of signature files - work only on (cropped) parts of the video --- Changelog | 1 + configure | 1 + doc/filters.texi | 89 +++++ libavfilter/Makefile | 1 + libavfilter/allfilters.c | 1 + libavfilter/signature.h | 571 ++++++++++++++++++++++++++++++ libavfilter/signature_lookup.c | 573 ++++++++++++++++++++++++++++++ libavfilter/version.h | 2 +- libavfilter/vf_signature.c | 771 +++++++++++++++++++++++++++++++++++++++++ 9 files changed, 2009 insertions(+), 1 deletion(-) create mode 100644 libavfilter/signature.h create mode 100644 libavfilter/signature_lookup.c create mode 100644 libavfilter/vf_signature.c diff --git a/Changelog b/Changelog index 18fdb27..bcf052c 100644 --- a/Changelog +++ b/Changelog @@ -25,6 +25,7 @@ version <next>: - hash and framehash muxers - colorspace filter - hdcd filter +- MPEG-7 Video Signature filter version 3.0: - Common Encryption (CENC) MP4 encoding and decoding support diff --git a/configure b/configure index f981a4e..3bf3592 100755 --- a/configure +++ b/configure @@ -2982,6 +2982,7 @@ showspectrum_filter_deps="avcodec" showspectrum_filter_select="fft" showspectrumpic_filter_deps="avcodec" showspectrumpic_filter_select="fft" +signature_filter_deps="gpl avcodec avformat" smartblur_filter_deps="gpl swscale" sofalizer_filter_deps="netcdf avcodec" sofalizer_filter_select="fft" diff --git a/doc/filters.texi b/doc/filters.texi index 61307e5..534e50a 100644 --- a/doc/filters.texi +++ b/doc/filters.texi @@ -11809,6 +11809,95 @@ saturation maximum: %@{metadata:lavfi.signalstats.SATMAX@} @end example @end itemize +@anchor{signature} +@section signature + +Calculates the MPEG-7 Video Signature. The filter could handle more than one +input. In this case the matching between the inputs could be calculated. The +filter passthrough the first input. The signature of each stream could be written +into a file. + +It accepts the following options: + +@table @option +@item detectmode +Enable or disable the matching process. + +Available values are: + +@table @samp +@item off +Disable the calculation of a matching (default). +@item full +Calculate the mathing for the whole video and output whether the whole video +matches or only parts. +@item fast +Calculate as long as a matching is found or the video ends. Should be faster in +some cases. +@end table + +@item nb_inputs +Set the number of inputs. The option value must be a non negative integer. +Default value is 1. + +@item filename +Set the path to which the output is written. If there is more than one input, +the path must be a prototype, i.e. must contain %d or %0nd (where n is a positive +integer), that will be replaced with the input number. If no filename is +specified, no output will be written. This is the default. + +@item format +Choose the output format. + +Available values are: + +@table @samp +@item binary +Use the specified binary representation (default). +@item xml +Use the specified xml representation. +@end table + +@item th_d +Set threshold to detect one word as similar. The option value must be an integer +greater than zero. The default value is 9000. + +@item th_dc +Set threshold to detect all words as similar. The option value must be an integer +greater than zero. The default value is 60000. + +@item th_xh +Set threshold to detect frames as similar. The option value must be an integer +greater than zero. The default value is 116. + +@item th_di +Set the minimum length of a sequence in frames to recognize it as matching +sequence. The option value must be a non negative integer value. +The default value is 0. + +@item th_it +Set the minimum relation, that matching frames to all frames must have. +The option value must be a double value between 0 and 1. The default value is 0.5. +@end table + +@subsection Examples + +@itemize +@item +To calculate the signature of an input video and store it in signature.bin: +@example +ffmpeg -i input.mkv -vf signature=filename=signature.bin -map 0:v -f null - +@end example + +@item +To detect whether two videos matches and store the signatures in XML format in +signature0.xml and signature1.xml: +@example +ffmpeg -i input1.mkv -i input2.mkv -filter_complex "[0:v][1:v] signature=nb_inputs=2:detectmode=full:format=xml:filename=signature%d.xml" -map :v -f null - +@end example + +@end itemize + @anchor{smartblur} @section smartblur diff --git a/libavfilter/Makefile b/libavfilter/Makefile index 2ea8c84..e8467f9 100644 --- a/libavfilter/Makefile +++ b/libavfilter/Makefile @@ -251,6 +251,7 @@ OBJS-$(CONFIG_SHOWPALETTE_FILTER) += vf_showpalette.o OBJS-$(CONFIG_SHUFFLEFRAMES_FILTER) += vf_shuffleframes.o OBJS-$(CONFIG_SHUFFLEPLANES_FILTER) += vf_shuffleplanes.o OBJS-$(CONFIG_SIGNALSTATS_FILTER) += vf_signalstats.o +OBJS-$(CONFIG_SIGNATURE_FILTER) += vf_signature.o OBJS-$(CONFIG_SMARTBLUR_FILTER) += vf_smartblur.o OBJS-$(CONFIG_SPLIT_FILTER) += split.o OBJS-$(CONFIG_SPP_FILTER) += vf_spp.o diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c index 52d3f61..6162821 100644 --- a/libavfilter/allfilters.c +++ b/libavfilter/allfilters.c @@ -271,6 +271,7 @@ void avfilter_register_all(void) REGISTER_FILTER(SHUFFLEFRAMES, shuffleframes, vf); REGISTER_FILTER(SHUFFLEPLANES, shuffleplanes, vf); REGISTER_FILTER(SIGNALSTATS, signalstats, vf); + REGISTER_FILTER(SIGNATURE, signature, vf); REGISTER_FILTER(SMARTBLUR, smartblur, vf); REGISTER_FILTER(SPLIT, split, vf); REGISTER_FILTER(SPP, spp, vf); diff --git a/libavfilter/signature.h b/libavfilter/signature.h new file mode 100644 index 0000000..ed1c32e --- /dev/null +++ b/libavfilter/signature.h @@ -0,0 +1,571 @@ +/* + * Copyright (c) 2016 Gerion Entrup + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with FFmpeg; if not, write to the Free Software Foundation, Inc., + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. + */ + +/** + * @file + * MPEG-7 video signature calculation and lookup filter + */ + +#ifndef AVFILTER_SIGNATURE_H +#define AVFILTER_SIGNATURE_H + +#include <float.h> +#include "libavutil/common.h" +#include "libavutil/opt.h" +#include "libavutil/timestamp.h" +#include "avfilter.h" +#include "internal.h" + +#define ELEMENT_COUNT 10 +#define SIGELEM_SIZE 380 +#define DIFFELEM_SIZE 348 /* SIGELEM_SIZE - elem_a1 - elem_a2 */ +#define COURSE_SIZE 90 + +enum lookup_mode { + MODE_OFF, + MODE_FULL, + MODE_FAST, + NB_LOOKUP_MODE +}; + +enum formats { + FORMAT_BINARY, + FORMAT_XML, + NB_FORMATS +}; + +typedef struct { + uint8_t x; + uint8_t y; +} Point; + +typedef struct { + Point up; + Point to; +} Block; + +typedef struct { + int av_elem; /* average element category */ + short left_count; /* count of blocks that will be added together */ + short block_count; /* count of blocks per element */ + short elem_count; + const Block* blocks; +} ElemCat; + +typedef struct FineSignature{ + struct FineSignature* next; + struct FineSignature* prev; + uint64_t pts; + uint32_t index; /* needed for xmlexport */ + uint8_t confidence; + uint8_t words[5]; + uint8_t framesig[SIGELEM_SIZE/5]; +} FineSignature; + +typedef struct CourseSignature{ + uint8_t data[5][31]; /* 5 words with min. 243 bit */ + struct FineSignature* first; /* associated Finesignatures */ + struct FineSignature* last; + struct CourseSignature* next; +} CourseSignature; + +/* lookup types */ +typedef struct MatchingInfo{ + double meandist; + double framerateratio; /* second/first */ + int score; + int offset; + int matchframes; /* number of matching frames */ + int whole; + struct FineSignature* first; + struct FineSignature* second; + struct MatchingInfo* next; +} MatchingInfo; + +typedef struct { + AVRational time_base; + /* needed for xml_export */ + int w; /* height */ + int h; /* width */ + + /* overflow protection */ + int divide; + + FineSignature* finesiglist; + FineSignature* curfinesig; + + CourseSignature* coursesiglist; + CourseSignature* courseend; /* needed for xml export */ + /* helpers to store the alternating signatures */ + CourseSignature* curcoursesig1; + CourseSignature* curcoursesig2; + + int coursecount; /* counter from 0 to 89 */ + int midcourse; /* whether it is a coursesignature beginning from 45 + i * 90 */ + uint32_t lastindex; /* helper to store amount of frames */ +} StreamContext; + +typedef struct { + const AVClass *class; + /* input parameters */ + int mode; + int nb_inputs; + char *filename; + int format; + int thworddist; + int thcomposdist; + int thl1; + int thdi; + int thit; + /* end input parameters */ + + uint8_t l1distlut[243*242/2]; /* 243 + 242 + 241 ... */ + StreamContext* streamcontexts; +} SignatureContext; + + +static const Block elem_a1_data[] = { + {{ 0, 0},{ 7, 7}}, + {{ 8, 0},{15, 7}}, + {{ 0, 8},{ 7,15}}, + {{ 8, 8},{15,15}}, + {{16, 0},{23, 7}}, + {{24, 0},{31, 7}}, + {{16, 8},{23,15}}, + {{24, 8},{31,15}}, + {{ 0,16},{ 7,23}}, + {{ 8,16},{15,23}}, + {{ 0,24},{ 7,31}}, + {{ 8,24},{15,31}}, + {{16,16},{23,23}}, + {{24,16},{31,23}}, + {{16,24},{23,31}}, + {{24,24},{31,31}}, + {{ 0, 0},{15,15}}, + {{16, 0},{31,15}}, + {{ 0,16},{15,31}}, + {{16,16},{31,31}} +}; +static const ElemCat elem_a1 = { 1, 1, 1, 20, elem_a1_data }; + +static const Block elem_a2_data[] = { + {{ 2, 2},{ 9, 9}}, + {{12, 2},{19, 9}}, + {{22, 2},{29, 9}}, + {{ 2,12},{ 9,19}}, + {{12,12},{19,19}}, + {{22,12},{29,19}}, + {{ 2,22},{ 9,29}}, + {{12,22},{19,29}}, + {{22,22},{29,29}}, + {{ 9, 9},{22,22}}, + {{ 6, 6},{25,25}}, + {{ 3, 3},{28,28}} +}; +static const ElemCat elem_a2 = { 1, 1, 1, 12, elem_a2_data }; + +static const Block elem_d1_data[] = { + {{ 0, 0},{ 1, 3}},{{ 2, 0},{ 3, 3}}, + {{ 4, 0},{ 7, 1}},{{ 4, 2},{ 7, 3}}, + {{ 0, 6},{ 3, 7}},{{ 0, 4},{ 3, 5}}, + {{ 6, 4},{ 7, 7}},{{ 4, 4},{ 5, 7}}, + {{ 8, 0},{ 9, 3}},{{10, 0},{11, 3}}, + {{12, 0},{15, 1}},{{12, 2},{15, 3}}, + {{ 8, 6},{11, 7}},{{ 8, 4},{11, 5}}, + {{14, 4},{15, 7}},{{12, 4},{13, 7}}, + {{ 0, 8},{ 1,11}},{{ 2, 8},{ 3,11}}, + {{ 4, 8},{ 7, 9}},{{ 4,10},{ 7,11}}, + {{ 0,14},{ 3,15}},{{ 0,12},{ 3,13}}, + {{ 6,12},{ 7,15}},{{ 4,12},{ 5,15}}, + {{ 8, 8},{ 9,11}},{{10, 8},{11,11}}, + {{12, 8},{15, 9}},{{12,10},{15,11}}, + {{ 8,14},{11,15}},{{ 8,12},{11,13}}, + {{14,12},{15,15}},{{12,12},{13,15}}, + {{16, 0},{19, 1}},{{16, 2},{19, 3}}, + {{22, 0},{23, 3}},{{20, 0},{21, 3}}, + {{16, 4},{17, 7}},{{18, 4},{19, 7}}, + {{20, 6},{23, 7}},{{20, 4},{23, 5}}, + {{24, 0},{27, 1}},{{24, 2},{27, 3}}, + {{30, 0},{31, 3}},{{28, 0},{29, 3}}, + {{24, 4},{25, 7}},{{26, 4},{27, 7}}, + {{28, 6},{31, 7}},{{28, 4},{31, 5}}, + {{16, 8},{19, 9}},{{16,10},{19,11}}, + {{22, 8},{23,11}},{{20, 8},{21,11}}, + {{16,12},{17,15}},{{18,12},{19,15}}, + {{20,14},{23,15}},{{20,12},{23,13}}, + {{24, 8},{27, 9}},{{24,10},{27,11}}, + {{30, 8},{31,11}},{{28, 8},{29,11}}, + {{24,12},{25,15}},{{26,12},{27,15}}, + {{28,14},{31,15}},{{28,12},{31,13}}, + {{ 0,16},{ 3,17}},{{ 0,18},{ 3,19}}, + {{ 6,16},{ 7,19}},{{ 4,16},{ 5,19}}, + {{ 0,20},{ 1,23}},{{ 2,20},{ 3,23}}, + {{ 4,22},{ 7,23}},{{ 4,20},{ 7,21}}, + {{ 8,16},{11,17}},{{ 8,18},{11,19}}, + {{14,16},{15,19}},{{12,16},{13,19}}, + {{ 8,20},{ 9,23}},{{10,20},{11,23}}, + {{12,22},{15,23}},{{12,20},{15,21}}, + {{ 0,24},{ 3,25}},{{ 0,26},{ 3,27}}, + {{ 6,24},{ 7,27}},{{ 4,24},{ 5,27}}, + {{ 0,28},{ 1,31}},{{ 2,28},{ 3,31}}, + {{ 4,30},{ 7,31}},{{ 4,28},{ 7,29}}, + {{ 8,24},{11,25}},{{ 8,26},{11,27}}, + {{14,24},{15,27}},{{12,24},{13,27}}, + {{ 8,28},{ 9,31}},{{10,28},{11,31}}, + {{12,30},{15,31}},{{12,28},{15,29}}, + {{16,16},{17,19}},{{18,16},{19,19}}, + {{20,16},{23,17}},{{20,18},{23,19}}, + {{16,22},{19,23}},{{16,20},{19,21}}, + {{22,20},{23,23}},{{20,20},{21,23}}, + {{24,16},{25,19}},{{26,16},{27,19}}, + {{28,16},{31,17}},{{28,18},{31,19}}, + {{24,22},{27,23}},{{24,20},{27,21}}, + {{30,20},{31,23}},{{28,20},{29,23}}, + {{16,24},{17,27}},{{18,24},{19,27}}, + {{20,24},{23,25}},{{20,26},{23,27}}, + {{16,30},{19,31}},{{16,28},{19,29}}, + {{22,28},{23,31}},{{20,28},{21,31}}, + {{24,24},{25,27}},{{26,24},{27,27}}, + {{28,24},{31,25}},{{28,26},{31,27}}, + {{24,30},{27,31}},{{24,28},{27,29}}, + {{30,28},{31,31}},{{28,28},{29,31}}, + {{ 2, 2},{ 3, 5}},{{ 4, 2},{ 5, 5}}, + {{ 6, 2},{ 9, 3}},{{ 6, 4},{ 9, 5}}, + {{ 2, 8},{ 5, 9}},{{ 2, 6},{ 5, 7}}, + {{ 8, 6},{ 9, 9}},{{ 6, 6},{ 7, 9}}, + {{12, 2},{13, 5}},{{14, 2},{15, 5}}, + {{16, 2},{19, 3}},{{16, 4},{19, 5}}, + {{12, 8},{15, 9}},{{12, 6},{15, 7}}, + {{18, 6},{19, 9}},{{16, 6},{17, 9}}, + {{22, 2},{23, 5}},{{24, 2},{25, 5}}, + {{26, 2},{29, 3}},{{26, 4},{29, 5}}, + {{22, 8},{25, 9}},{{22, 6},{25, 7}}, + {{28, 6},{29, 9}},{{26, 6},{27, 9}}, + {{ 2,12},{ 3,15}},{{ 4,12},{ 5,15}}, + {{ 6,12},{ 9,13}},{{ 6,14},{ 9,15}}, + {{ 2,18},{ 5,19}},{{ 2,16},{ 5,17}}, + {{ 8,16},{ 9,19}},{{ 6,16},{ 7,19}}, + {{12,12},{15,13}},{{12,14},{15,15}}, + {{16,12},{19,13}},{{16,14},{19,15}}, + {{12,18},{15,19}},{{12,16},{15,17}}, + {{16,18},{19,19}},{{16,16},{19,17}}, + {{22,12},{23,15}},{{24,12},{25,15}}, + {{26,12},{29,13}},{{26,14},{29,15}}, + {{22,18},{25,19}},{{22,16},{25,17}}, + {{28,16},{29,19}},{{26,16},{27,19}}, + {{ 2,22},{ 3,25}},{{ 4,22},{ 5,25}}, + {{ 6,22},{ 9,23}},{{ 6,24},{ 9,25}}, + {{ 2,28},{ 5,29}},{{ 2,26},{ 5,27}}, + {{ 8,26},{ 9,29}},{{ 6,26},{ 7,29}}, + {{12,22},{13,25}},{{14,22},{15,25}}, + {{16,22},{19,23}},{{16,24},{19,25}}, + {{12,28},{15,29}},{{12,26},{15,27}}, + {{18,26},{19,29}},{{16,26},{17,29}}, + {{22,22},{23,25}},{{24,22},{25,25}}, + {{26,22},{29,23}},{{26,24},{29,25}}, + {{22,28},{25,29}},{{22,26},{25,27}}, + {{28,26},{29,29}},{{26,26},{27,29}}, + {{ 7, 7},{10, 8}},{{ 7, 9},{10,10}}, + {{11, 7},{12,10}},{{13, 7},{14,10}}, + {{ 7,11},{ 8,14}},{{ 9,11},{10,14}}, + {{11,11},{14,12}},{{11,13},{14,14}}, + {{17, 7},{20, 8}},{{17, 9},{20,10}}, + {{21, 7},{22,10}},{{23, 7},{24,10}}, + {{17,11},{18,14}},{{19,11},{20,14}}, + {{21,11},{24,12}},{{21,13},{24,14}}, + {{ 7,17},{10,18}},{{ 7,19},{10,20}}, + {{11,17},{12,20}},{{13,17},{14,20}}, + {{ 7,21},{ 8,24}},{{ 9,21},{10,24}}, + {{11,21},{14,22}},{{11,23},{14,24}}, + {{17,17},{20,18}},{{17,19},{20,20}}, + {{21,17},{22,20}},{{23,17},{24,20}}, + {{17,21},{18,24}},{{19,21},{20,24}}, + {{21,21},{24,22}},{{21,23},{24,24}} +}; +static const ElemCat elem_d1 = { 0, 1, 2, 116, elem_d1_data }; + +static const Block elem_d2_data[] = { + {{ 0, 0},{ 3, 3}},{{ 4, 4},{ 7, 7}},{{ 4, 0},{ 7, 3}},{{ 0, 4},{ 3, 7}}, + {{ 8, 0},{11, 3}},{{12, 4},{15, 7}},{{12, 0},{15, 3}},{{ 8, 4},{11, 7}}, + {{16, 0},{19, 3}},{{20, 4},{23, 7}},{{20, 0},{23, 3}},{{16, 4},{19, 7}}, + {{24, 0},{27, 3}},{{28, 4},{31, 7}},{{28, 0},{31, 3}},{{24, 4},{27, 7}}, + {{ 0, 8},{ 3,11}},{{ 4,12},{ 7,15}},{{ 4, 8},{ 7,11}},{{ 0,12},{ 3,15}}, + {{ 8, 8},{11,11}},{{12,12},{15,15}},{{12, 8},{15,11}},{{ 8,12},{11,15}}, + {{16, 8},{19,11}},{{20,12},{23,15}},{{20, 8},{23,11}},{{16,12},{19,15}}, + {{24, 8},{27,11}},{{28,12},{31,15}},{{28, 8},{31,11}},{{24,12},{27,15}}, + {{ 0,16},{ 3,19}},{{ 4,20},{ 7,23}},{{ 4,16},{ 7,19}},{{ 0,20},{ 3,23}}, + {{ 8,16},{11,19}},{{12,20},{15,23}},{{12,16},{15,19}},{{ 8,20},{11,23}}, + {{16,16},{19,19}},{{20,20},{23,23}},{{20,16},{23,19}},{{16,20},{19,23}}, + {{24,16},{27,19}},{{28,20},{31,23}},{{28,16},{31,19}},{{24,20},{27,23}}, + {{ 0,24},{ 3,27}},{{ 4,28},{ 7,31}},{{ 4,24},{ 7,27}},{{ 0,28},{ 3,31}}, + {{ 8,24},{11,27}},{{12,28},{15,31}},{{12,24},{15,27}},{{ 8,28},{11,31}}, + {{16,24},{19,27}},{{20,28},{23,31}},{{20,24},{23,27}},{{16,28},{19,31}}, + {{24,24},{27,27}},{{28,28},{31,31}},{{28,24},{31,27}},{{24,28},{27,31}}, + {{ 4, 4},{ 7, 7}},{{ 8, 8},{11,11}},{{ 8, 4},{11, 7}},{{ 4, 8},{ 7,11}}, + {{12, 4},{15, 7}},{{16, 8},{19,11}},{{16, 4},{19, 7}},{{12, 8},{15,11}}, + {{20, 4},{23, 7}},{{24, 8},{27,11}},{{24, 4},{27, 7}},{{20, 8},{23,11}}, + {{ 4,12},{ 7,15}},{{ 8,16},{11,19}},{{ 8,12},{11,15}},{{ 4,16},{ 7,19}}, + {{12,12},{15,15}},{{16,16},{19,19}},{{16,12},{19,15}},{{12,16},{15,19}}, + {{20,12},{23,15}},{{24,16},{27,19}},{{24,12},{27,15}},{{20,16},{23,19}}, + {{ 4,20},{ 7,23}},{{ 8,24},{11,27}},{{ 8,20},{11,23}},{{ 4,24},{ 7,27}}, + {{12,20},{15,23}},{{16,24},{19,27}},{{16,20},{19,23}},{{12,24},{15,27}}, + {{20,20},{23,23}},{{24,24},{27,27}},{{24,20},{27,23}},{{20,24},{23,27}} +}; +static const ElemCat elem_d2 = { 0, 2, 4, 25, elem_d2_data }; + +static const Block elem_d3_data[] = { + {{ 1, 1},{10,10}},{{11, 1},{20,10}}, + {{ 1, 1},{10,10}},{{21, 1},{30,10}}, + {{ 1, 1},{10,10}},{{ 1,11},{10,20}}, + {{ 1, 1},{10,10}},{{11,11},{20,20}}, + {{ 1, 1},{10,10}},{{21,11},{30,20}}, + {{ 1, 1},{10,10}},{{ 1,21},{10,30}}, + {{ 1, 1},{10,10}},{{11,21},{20,30}}, + {{ 1, 1},{10,10}},{{21,21},{30,30}}, + {{11, 1},{20,10}},{{21, 1},{30,10}}, + {{11, 1},{20,10}},{{ 1,11},{10,20}}, + {{11, 1},{20,10}},{{11,11},{20,20}}, + {{11, 1},{20,10}},{{21,11},{30,20}}, + {{11, 1},{20,10}},{{ 1,21},{10,30}}, + {{11, 1},{20,10}},{{11,21},{20,30}}, + {{11, 1},{20,10}},{{21,21},{30,30}}, + {{21, 1},{30,10}},{{ 1,11},{10,20}}, + {{21, 1},{30,10}},{{11,11},{20,20}}, + {{21, 1},{30,10}},{{21,11},{30,20}}, + {{21, 1},{30,10}},{{ 1,21},{10,30}}, + {{21, 1},{30,10}},{{11,21},{20,30}}, + {{21, 1},{30,10}},{{21,21},{30,30}}, + {{ 1,11},{10,20}},{{11,11},{20,20}}, + {{ 1,11},{10,20}},{{21,11},{30,20}}, + {{ 1,11},{10,20}},{{ 1,21},{10,30}}, + {{ 1,11},{10,20}},{{11,21},{20,30}}, + {{ 1,11},{10,20}},{{21,21},{30,30}}, + {{11,11},{20,20}},{{21,11},{30,20}}, + {{11,11},{20,20}},{{ 1,21},{10,30}}, + {{11,11},{20,20}},{{11,21},{20,30}}, + {{11,11},{20,20}},{{21,21},{30,30}}, + {{21,11},{30,20}},{{ 1,21},{10,30}}, + {{21,11},{30,20}},{{11,21},{20,30}}, + {{21,11},{30,20}},{{21,21},{30,30}}, + {{ 1,21},{10,30}},{{11,21},{20,30}}, + {{ 1,21},{10,30}},{{21,21},{30,30}}, + {{11,21},{20,30}},{{21,21},{30,30}} +}; +static const ElemCat elem_d3 = { 0, 1, 2, 36, elem_d3_data }; + +static const Block elem_d4_data[] = { + {{ 7,13},{12,18}},{{19,13},{24,18}}, + {{13, 7},{18,12}},{{13,19},{18,24}}, + {{ 7, 7},{12,12}},{{19,19},{24,24}}, + {{19, 7},{24,12}},{{ 7,19},{12,24}}, + {{13, 7},{18,12}},{{19,13},{24,18}}, + {{19,13},{24,18}},{{13,19},{18,24}}, + {{13,19},{18,24}},{{ 7,13},{12,18}}, + {{ 7,13},{12,18}},{{13, 7},{18,12}}, + {{ 7, 7},{12,12}},{{19, 7},{24,12}}, + {{19, 7},{24,12}},{{19,19},{24,24}}, + {{19,19},{24,24}},{{ 7,19},{12,24}}, + {{ 7,19},{12,24}},{{ 7, 7},{12,12}}, + {{13,13},{18,18}},{{13, 1},{18, 6}}, + {{13,13},{18,18}},{{25,13},{30,18}}, + {{13,13},{18,18}},{{13,25},{18,30}}, + {{13,13},{18,18}},{{ 1,13},{ 6,18}}, + {{13, 1},{18, 6}},{{13,25},{18,30}}, + {{ 1,13},{ 6,18}},{{25,13},{30,18}}, + {{ 7, 1},{12, 6}},{{19, 1},{24, 6}}, + {{ 7,25},{12,30}},{{19,25},{24,30}}, + {{ 1, 7},{ 6,12}},{{ 1,19},{ 6,24}}, + {{25, 7},{30,12}},{{25,19},{30,24}}, + {{ 7, 1},{12, 6}},{{ 1, 7},{ 6,12}}, + {{19, 1},{24, 6}},{{25, 7},{30,12}}, + {{25,19},{30,24}},{{19,25},{24,30}}, + {{ 1,19},{ 6,24}},{{ 7,25},{12,30}}, + {{ 1, 1},{ 6, 6}},{{25, 1},{30, 6}}, + {{25, 1},{30, 6}},{{25,25},{30,30}}, + {{25,25},{30,30}},{{ 1,25},{ 6,30}}, + {{ 1,25},{ 6,30}},{{ 1, 1},{ 6, 6}} +}; +static const ElemCat elem_d4 = { 0, 1, 2, 30, elem_d4_data }; + +static const Block elem_d5_data[] = { + {{ 1, 1},{10, 3}},{{ 1, 4},{ 3, 7}},{{ 8, 4},{10, 7}},{{ 1, 8},{10,10}},{{ 4, 4},{ 7, 7}}, + {{11, 1},{20, 3}},{{11, 4},{13, 7}},{{18, 4},{20, 7}},{{11, 8},{20,10}},{{14, 4},{17, 7}}, + {{21, 1},{30, 3}},{{21, 4},{23, 7}},{{28, 4},{30, 7}},{{21, 8},{30,10}},{{24, 4},{27, 7}}, + {{ 1,11},{10,13}},{{ 1,14},{ 3,17}},{{ 8,14},{10,17}},{{ 1,18},{10,20}},{{ 4,14},{ 7,17}}, + {{11,11},{20,13}},{{11,14},{13,17}},{{18,14},{20,17}},{{11,18},{20,20}},{{14,14},{17,17}}, + {{21,11},{30,13}},{{21,14},{23,17}},{{28,14},{30,17}},{{21,18},{30,20}},{{24,14},{27,17}}, + {{ 1,21},{10,23}},{{ 1,24},{ 3,27}},{{ 8,24},{10,27}},{{ 1,28},{10,30}},{{ 4,24},{ 7,27}}, + {{11,21},{20,23}},{{11,24},{13,27}},{{18,24},{20,27}},{{11,28},{20,30}},{{14,24},{17,27}}, + {{21,21},{30,23}},{{21,24},{23,27}},{{28,24},{30,27}},{{21,28},{30,30}},{{24,24},{27,27}}, + {{ 6, 6},{15, 8}},{{ 6, 9},{ 8,12}},{{13, 9},{15,12}},{{ 6,13},{15,15}},{{ 9, 9},{12,12}}, + {{16, 6},{25, 8}},{{16, 9},{18,12}},{{23, 9},{25,12}},{{16,13},{25,15}},{{19, 9},{22,12}}, + {{ 6,16},{15,18}},{{ 6,19},{ 8,22}},{{13,19},{15,22}},{{ 6,23},{15,25}},{{ 9,19},{12,22}}, + {{16,16},{25,18}},{{16,19},{18,22}},{{23,19},{25,22}},{{16,23},{25,25}},{{19,19},{22,22}}, + {{ 6, 1},{15, 3}},{{ 6, 4},{ 8, 7}},{{13, 4},{15, 7}},{{ 6, 8},{15,10}},{{ 9, 4},{12, 7}}, + {{16, 1},{25, 3}},{{16, 4},{18, 7}},{{23, 4},{25, 7}},{{16, 8},{25,10}},{{19, 4},{22, 7}}, + {{ 1, 6},{10, 8}},{{ 1, 9},{ 3,12}},{{ 8, 9},{10,12}},{{ 1,13},{10,15}},{{ 4, 9},{ 7,12}}, + {{11, 6},{20, 8}},{{11, 9},{13,12}},{{18, 9},{20,12}},{{11,13},{20,15}},{{14, 9},{17,12}}, + {{21, 6},{30, 8}},{{21, 9},{23,12}},{{28, 9},{30,12}},{{21,13},{30,15}},{{24, 9},{27,12}}, + {{ 6,11},{15,13}},{{ 6,14},{ 8,17}},{{13,14},{15,17}},{{ 6,18},{15,20}},{{ 9,14},{12,17}}, + {{16,11},{25,13}},{{16,14},{18,17}},{{23,14},{25,17}},{{16,18},{25,20}},{{19,14},{22,17}}, + {{ 1,16},{10,18}},{{ 1,19},{ 3,22}},{{ 8,19},{10,22}},{{ 1,23},{10,25}},{{ 4,19},{ 7,22}}, + {{11,16},{20,18}},{{11,19},{13,22}},{{18,19},{20,22}},{{11,23},{20,25}},{{14,19},{17,22}}, + {{21,16},{30,18}},{{21,19},{23,22}},{{28,19},{30,22}},{{21,23},{30,25}},{{24,19},{27,22}}, + {{ 6,21},{15,23}},{{ 6,24},{ 8,27}},{{13,24},{15,27}},{{ 6,28},{15,30}},{{ 9,24},{12,27}}, + {{16,21},{25,23}},{{16,24},{18,27}},{{23,24},{25,27}},{{16,28},{25,30}},{{19,24},{22,27}}, + {{ 2, 2},{14, 6}},{{ 2, 7},{ 6, 9}},{{10, 7},{14, 9}},{{ 2,10},{14,14}},{{ 7, 7},{ 9, 9}}, + {{ 7, 2},{19, 6}},{{ 7, 7},{11, 9}},{{15, 7},{19, 9}},{{ 7,10},{19,14}},{{12, 7},{14, 9}}, + {{12, 2},{24, 6}},{{12, 7},{16, 9}},{{20, 7},{24, 9}},{{12,10},{24,14}},{{17, 7},{19, 9}}, + {{17, 2},{29, 6}},{{17, 7},{21, 9}},{{25, 7},{29, 9}},{{17,10},{29,14}},{{22, 7},{24, 9}}, + {{ 2, 7},{14,11}},{{ 2,12},{ 6,14}},{{10,12},{14,14}},{{ 2,15},{14,19}},{{ 7,12},{ 9,14}}, + {{ 7, 7},{19,11}},{{ 7,12},{11,14}},{{15,12},{19,14}},{{ 7,15},{19,19}},{{12,12},{14,14}}, + {{12, 7},{24,11}},{{12,12},{16,14}},{{20,12},{24,14}},{{12,15},{24,19}},{{17,12},{19,14}}, + {{17, 7},{29,11}},{{17,12},{21,14}},{{25,12},{29,14}},{{17,15},{29,19}},{{22,12},{24,14}}, + {{ 2,12},{14,16}},{{ 2,17},{ 6,19}},{{10,17},{14,19}},{{ 2,20},{14,24}},{{ 7,17},{ 9,19}}, + {{ 7,12},{19,16}},{{ 7,17},{11,19}},{{15,17},{19,19}},{{ 7,20},{19,24}},{{12,17},{14,19}}, + {{12,12},{24,16}},{{12,17},{16,19}},{{20,17},{24,19}},{{12,20},{24,24}},{{17,17},{19,19}}, + {{17,12},{29,16}},{{17,17},{21,19}},{{25,17},{29,19}},{{17,20},{29,24}},{{22,17},{24,19}}, + {{ 2,17},{14,21}},{{ 2,22},{ 6,24}},{{10,22},{14,24}},{{ 2,25},{14,29}},{{ 7,22},{ 9,24}}, + {{ 7,17},{19,21}},{{ 7,22},{11,24}},{{15,22},{19,24}},{{ 7,25},{19,29}},{{12,22},{14,24}}, + {{12,17},{24,21}},{{12,22},{16,24}},{{20,22},{24,24}},{{12,25},{24,29}},{{17,22},{19,24}}, + {{17,17},{29,21}},{{17,22},{21,24}},{{25,22},{29,24}},{{17,25},{29,29}},{{22,22},{24,24}}, + {{ 8, 3},{13, 4}},{{ 8, 5},{ 9, 6}},{{12, 5},{13, 6}},{{ 8, 7},{13, 8}},{{10, 5},{11, 6}}, + {{13, 3},{18, 4}},{{13, 5},{14, 6}},{{17, 5},{18, 6}},{{13, 7},{18, 8}},{{15, 5},{16, 6}}, + {{18, 3},{23, 4}},{{18, 5},{19, 6}},{{22, 5},{23, 6}},{{18, 7},{23, 8}},{{20, 5},{21, 6}}, + {{ 3, 8},{ 8, 9}},{{ 3,10},{ 4,11}},{{ 7,10},{ 8,11}},{{ 3,12},{ 8,13}},{{ 5,10},{ 6,11}}, + {{ 8, 8},{13, 9}},{{ 8,10},{ 9,11}},{{12,10},{13,11}},{{ 8,12},{13,13}},{{10,10},{11,11}}, + {{13, 8},{18, 9}},{{13,10},{14,11}},{{17,10},{18,11}},{{13,12},{18,13}},{{15,10},{16,11}}, + {{18, 8},{23, 9}},{{18,10},{19,11}},{{22,10},{23,11}},{{18,12},{23,13}},{{20,10},{21,11}}, + {{23, 8},{28, 9}},{{23,10},{24,11}},{{27,10},{28,11}},{{23,12},{28,13}},{{25,10},{26,11}}, + {{ 3,13},{ 8,14}},{{ 3,15},{ 4,16}},{{ 7,15},{ 8,16}},{{ 3,17},{ 8,18}},{{ 5,15},{ 6,16}}, + {{ 8,13},{13,14}},{{ 8,15},{ 9,16}},{{12,15},{13,16}},{{ 8,17},{13,18}},{{10,15},{11,16}}, + {{13,13},{18,14}},{{13,15},{14,16}},{{17,15},{18,16}},{{13,17},{18,18}},{{15,15},{16,16}}, + {{18,13},{23,14}},{{18,15},{19,16}},{{22,15},{23,16}},{{18,17},{23,18}},{{20,15},{21,16}}, + {{23,13},{28,14}},{{23,15},{24,16}},{{27,15},{28,16}},{{23,17},{28,18}},{{25,15},{26,16}}, + {{ 3,18},{ 8,19}},{{ 3,20},{ 4,21}},{{ 7,20},{ 8,21}},{{ 3,22},{ 8,23}},{{ 5,20},{ 6,21}}, + {{ 8,18},{13,19}},{{ 8,20},{ 9,21}},{{12,20},{13,21}},{{ 8,22},{13,23}},{{10,20},{11,21}}, + {{13,18},{18,19}},{{13,20},{14,21}},{{17,20},{18,21}},{{13,22},{18,23}},{{15,20},{16,21}}, + {{18,18},{23,19}},{{18,20},{19,21}},{{22,20},{23,21}},{{18,22},{23,23}},{{20,20},{21,21}}, + {{23,18},{28,19}},{{23,20},{24,21}},{{27,20},{28,21}},{{23,22},{28,23}},{{25,20},{26,21}}, + {{ 8,23},{13,24}},{{ 8,25},{ 9,26}},{{12,25},{13,26}},{{ 8,27},{13,28}},{{10,25},{11,26}}, + {{13,23},{18,24}},{{13,25},{14,26}},{{17,25},{18,26}},{{13,27},{18,28}},{{15,25},{16,26}}, + {{18,23},{23,24}},{{18,25},{19,26}},{{22,25},{23,26}},{{18,27},{23,28}},{{20,25},{21,26}} +}; +static const ElemCat elem_d5 = { 0, 4, 5, 62, elem_d5_data }; + +static const Block elem_d6_data[] = { + {{ 3, 5},{12,10}},{{ 5, 3},{10,12}}, + {{11, 5},{20,10}},{{13, 3},{18,12}}, + {{19, 5},{28,10}},{{21, 3},{26,12}}, + {{ 3,13},{12,18}},{{ 5,11},{10,20}}, + {{11,13},{20,18}},{{13,11},{18,20}}, + {{19,13},{28,18}},{{21,11},{26,20}}, + {{ 3,21},{12,26}},{{ 5,19},{10,28}}, + {{11,21},{20,26}},{{13,19},{18,28}}, + {{19,21},{28,26}},{{21,19},{26,28}} +}; +static const ElemCat elem_d6 = { 0, 1, 2, 9, elem_d6_data }; + +static const Block elem_d7_data[] = { + {{ 0, 4},{ 3, 7}},{{ 8, 4},{11, 7}},{{ 4, 4},{ 7, 7}}, + {{ 4, 0},{ 7, 3}},{{ 4, 8},{ 7,11}},{{ 4, 4},{ 7, 7}}, + {{ 5, 4},{ 8, 7}},{{13, 4},{16, 7}},{{ 9, 4},{12, 7}}, + {{ 9, 0},{12, 3}},{{ 9, 8},{12,11}},{{ 9, 4},{12, 7}}, + {{10, 4},{13, 7}},{{18, 4},{21, 7}},{{14, 4},{17, 7}}, + {{14, 0},{17, 3}},{{14, 8},{17,11}},{{14, 4},{17, 7}}, + {{15, 4},{18, 7}},{{23, 4},{26, 7}},{{19, 4},{22, 7}}, + {{19, 0},{22, 3}},{{19, 8},{22,11}},{{19, 4},{22, 7}}, + {{20, 4},{23, 7}},{{28, 4},{31, 7}},{{24, 4},{27, 7}}, + {{24, 0},{27, 3}},{{24, 8},{27,11}},{{24, 4},{27, 7}}, + {{ 0, 9},{ 3,12}},{{ 8, 9},{11,12}},{{ 4, 9},{ 7,12}}, + {{ 4, 5},{ 7, 8}},{{ 4,13},{ 7,16}},{{ 4, 9},{ 7,12}}, + {{ 5, 9},{ 8,12}},{{13, 9},{16,12}},{{ 9, 9},{12,12}}, + {{ 9, 5},{12, 8}},{{ 9,13},{12,16}},{{ 9, 9},{12,12}}, + {{10, 9},{13,12}},{{18, 9},{21,12}},{{14, 9},{17,12}}, + {{14, 5},{17, 8}},{{14,13},{17,16}},{{14, 9},{17,12}}, + {{15, 9},{18,12}},{{23, 9},{26,12}},{{19, 9},{22,12}}, + {{19, 5},{22, 8}},{{19,13},{22,16}},{{19, 9},{22,12}}, + {{20, 9},{23,12}},{{28, 9},{31,12}},{{24, 9},{27,12}}, + {{24, 5},{27, 8}},{{24,13},{27,16}},{{24, 9},{27,12}}, + {{ 0,14},{ 3,17}},{{ 8,14},{11,17}},{{ 4,14},{ 7,17}}, + {{ 4,10},{ 7,13}},{{ 4,18},{ 7,21}},{{ 4,14},{ 7,17}}, + {{ 5,14},{ 8,17}},{{13,14},{16,17}},{{ 9,14},{12,17}}, + {{ 9,10},{12,13}},{{ 9,18},{12,21}},{{ 9,14},{12,17}}, + {{10,14},{13,17}},{{18,14},{21,17}},{{14,14},{17,17}}, + {{14,10},{17,13}},{{14,18},{17,21}},{{14,14},{17,17}}, + {{15,14},{18,17}},{{23,14},{26,17}},{{19,14},{22,17}}, + {{19,10},{22,13}},{{19,18},{22,21}},{{19,14},{22,17}}, + {{20,14},{23,17}},{{28,14},{31,17}},{{24,14},{27,17}}, + {{24,10},{27,13}},{{24,18},{27,21}},{{24,14},{27,17}}, + {{ 0,19},{ 3,22}},{{ 8,19},{11,22}},{{ 4,19},{ 7,22}}, + {{ 4,15},{ 7,18}},{{ 4,23},{ 7,26}},{{ 4,19},{ 7,22}}, + {{ 5,19},{ 8,22}},{{13,19},{16,22}},{{ 9,19},{12,22}}, + {{ 9,15},{12,18}},{{ 9,23},{12,26}},{{ 9,19},{12,22}}, + {{10,19},{13,22}},{{18,19},{21,22}},{{14,19},{17,22}}, + {{14,15},{17,18}},{{14,23},{17,26}},{{14,19},{17,22}}, + {{15,19},{18,22}},{{23,19},{26,22}},{{19,19},{22,22}}, + {{19,15},{22,18}},{{19,23},{22,26}},{{19,19},{22,22}}, + {{20,19},{23,22}},{{28,19},{31,22}},{{24,19},{27,22}}, + {{24,15},{27,18}},{{24,23},{27,26}},{{24,19},{27,22}}, + {{ 0,24},{ 3,27}},{{ 8,24},{11,27}},{{ 4,24},{ 7,27}}, + {{ 4,20},{ 7,23}},{{ 4,28},{ 7,31}},{{ 4,24},{ 7,27}}, + {{ 5,24},{ 8,27}},{{13,24},{16,27}},{{ 9,24},{12,27}}, + {{ 9,20},{12,23}},{{ 9,28},{12,31}},{{ 9,24},{12,27}}, + {{10,24},{13,27}},{{18,24},{21,27}},{{14,24},{17,27}}, + {{14,20},{17,23}},{{14,28},{17,31}},{{14,24},{17,27}}, + {{15,24},{18,27}},{{23,24},{26,27}},{{19,24},{22,27}}, + {{19,20},{22,23}},{{19,28},{22,31}},{{19,24},{22,27}}, + {{20,24},{23,27}},{{28,24},{31,27}},{{24,24},{27,27}}, + {{24,20},{27,23}},{{24,28},{27,31}},{{24,24},{27,27}} +}; +static const ElemCat elem_d7 = { 0, 2, 3, 50, elem_d7_data }; + +static const Block elem_d8_data[] = { + {{ 0, 0},{ 7, 3}},{{ 0, 4},{ 7, 7}}, + {{ 8, 0},{11, 7}},{{12, 0},{15, 7}}, + {{ 0, 8},{ 3,15}},{{ 4, 8},{ 7,15}}, + {{ 8, 8},{15,11}},{{ 8,12},{15,15}}, + {{16, 0},{19, 7}},{{20, 0},{23, 7}}, + {{24, 0},{31, 3}},{{24, 4},{31, 7}}, + {{16, 8},{23,11}},{{16,12},{23,15}}, + {{24, 8},{27,15}},{{28, 8},{31,15}}, + {{ 0,16},{ 3,23}},{{ 4,16},{ 7,23}}, + {{ 8,16},{15,19}},{{ 8,20},{15,23}}, + {{ 0,24},{ 7,27}},{{ 0,28},{ 7,31}}, + {{ 8,24},{11,31}},{{12,24},{15,31}}, + {{16,16},{23,19}},{{16,20},{23,23}}, + {{24,16},{27,23}},{{28,16},{31,23}}, + {{16,24},{19,31}},{{20,24},{23,31}}, + {{24,24},{31,27}},{{24,28},{31,31}}, + {{ 0, 0},{ 7,15}},{{ 8, 0},{15,15}}, + {{16, 0},{31, 7}},{{16, 8},{31,15}}, + {{ 0,16},{15,23}},{{ 0,24},{15,31}}, + {{16,16},{23,31}},{{24,16},{31,31}} +}; +static const ElemCat elem_d8 = { 0, 1, 2, 20, elem_d8_data }; + +static const ElemCat* elements[ELEMENT_COUNT] = { &elem_a1, &elem_a2, + &elem_d1, &elem_d2, &elem_d3, &elem_d4, + &elem_d5, &elem_d6, &elem_d7, &elem_d8 }; + +// TODO the lut could be here, with calculation at compile time, but don't know how possible +//static const int l1distlut[255*256/2]; + +#endif /* AVFILTER_SIGNATURE_H */ diff --git a/libavfilter/signature_lookup.c b/libavfilter/signature_lookup.c new file mode 100644 index 0000000..bf55ecd --- /dev/null +++ b/libavfilter/signature_lookup.c @@ -0,0 +1,573 @@ +/* + * Copyright (c) 2016 Gerion Entrup + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with FFmpeg; if not, write to the Free Software Foundation, Inc., + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. + */ + +/** + * @file + * MPEG-7 video signature calculation and lookup filter + */ + +#include "signature.h" + +#define HOUGH_MAX_OFFSET 90 +#define MAX_FRAMERATE 60 + +#define DIR_PREV 0 +#define DIR_NEXT 1 +#define DIR_PREV_END 2 +#define DIR_NEXT_END 3 + +#define STATUS_NULL 0 +#define STATUS_END_REACHED 1 +#define STATUS_BEGIN_REACHED 2 + +static void fill_l1distlut(uint8_t lut[]) +{ + int i,j,tmp_i,tmp_j,count; + uint8_t dist; + + for (i = 0, count = 0; i < 242; i++) { + for (j = i + 1; j < 243; j++, count++) { + /* ternary distance between i and j */ + dist = 0; + tmp_i = i; tmp_j = j; + do { + dist += FFABS((tmp_j % 3) - (tmp_i % 3)); + tmp_j /= 3; + tmp_i /= 3; + } while (tmp_i > 0 || tmp_j > 0); + lut[count] = dist; + } + } +} + +static unsigned int intersection_word(const uint8_t *first, const uint8_t *second) +{ + unsigned int val=0,i; + for (i = 0; i < 28; i += 4) { + val += av_popcount( (first[i] & second[i] ) << 24 | + (first[i+1] & second[i+1]) << 16 | + (first[i+2] & second[i+2]) << 8 | + (first[i+3] & second[i+3]) ); + } + val += av_popcount( (first[28] & second[28]) << 16 | + (first[29] & second[29]) << 8 | + (first[30] & second[30]) ); + return val; +} + +static unsigned int union_word(const uint8_t *first, const uint8_t *second) +{ + unsigned int val=0,i; + for (i = 0; i < 28; i += 4) { + val += av_popcount( (first[i] | second[i] ) << 24 | + (first[i+1] | second[i+1]) << 16 | + (first[i+2] | second[i+2]) << 8 | + (first[i+3] | second[i+3]) ); + } + val += av_popcount( (first[28] | second[28]) << 16 | + (first[29] | second[29]) << 8 | + (first[30] | second[30]) ); + return val; +} + +static unsigned int get_l1dist(AVFilterContext *ctx, SignatureContext *sc, const uint8_t *first, const uint8_t *second) +{ + unsigned int i; + unsigned int dist = 0; + uint8_t f,s; + + for (i = 0; i < SIGELEM_SIZE/5; i++) { + if (first[i] != second[i]) { + f = first[i]; + s = second[i]; + if (f > s) { + /* little variation of gauss sum formula */ + dist += sc->l1distlut[243*242/2 - (243-s)*(242-s)/2 + f - s - 1]; + } else { + dist += sc->l1distlut[243*242/2 - (243-f)*(242-f)/2 + s - f - 1]; + } + } + } + return dist; +} + +/** + * calculates the jaccard distance and evaluates a pair of course signatures as good + * @return 0 if pair is bad, 1 otherwise + */ +static int get_jaccarddist(SignatureContext *sc, CourseSignature *first, CourseSignature *second) +{ + int jaccarddist, i, composdist = 0, cwthcount = 0; + for (i = 0; i < 5; i++) { + if ((jaccarddist = intersection_word(first->data[i], second->data[i])) > 0) { + jaccarddist /= union_word(first->data[i], second->data[i]); + } + if (jaccarddist >= sc->thworddist) { + if (++cwthcount > 2) { + /* more than half (5/2) of distances are too wide */ + return 0; + } + } + composdist += jaccarddist; + if (composdist > sc->thcomposdist) { + return 0; + } + } + return 1; +} + +/** + * step through the coursesignatures as long as a good candidate is found + * @return 0 if no candidate is found, 1 otherwise + */ +static int find_next_coursecandidate(SignatureContext *sc, CourseSignature *secondstart, CourseSignature **first, CourseSignature **second, int start) +{ + /* go one coursesignature foreword */ + if (!start) { + if ((*second)->next) { + *second = (*second)->next; + } else if ((*first)->next) { + *second = secondstart; + *first = (*first)->next; + } else { + return 0; + } + } + + while (1) { + if (get_jaccarddist(sc, *first, *second)) + return 1; + + /* next signature */ + if ((*second)->next) { + *second = (*second)->next; + } else if ((*first)->next) { + *second = secondstart; + *first = (*first)->next; + } else { + return 0; + } + } +} + +/** + * compares framesignatures and sorts out signatures with a l1 distance above a given threshold. + * Then tries to find out offset and differences between framerates with a hough transformation + */ +static MatchingInfo* get_matching_parameters(AVFilterContext *ctx, SignatureContext *sc, FineSignature *first, FineSignature *second) +{ + FineSignature *f,*s; + size_t i, j, k, l, hmax = 0, score; + int framerate, offset, l1dist; + double m; + MatchingInfo *cands = NULL, *c = NULL; + + struct { + uint8_t size; + unsigned int dist; + FineSignature *a; + uint8_t b_pos[COURSE_SIZE]; + FineSignature *b[COURSE_SIZE]; + } pairs[COURSE_SIZE]; + + typedef struct { + int dist; + size_t score; + FineSignature *a; + FineSignature *b; + } hspace_elem; + + /* houghspace */ + hspace_elem** hspace = av_malloc_array(MAX_FRAMERATE, sizeof(hspace_elem *)); + + /* initialize houghspace */ + for (i = 0; i < MAX_FRAMERATE; i++) { + hspace[i] = av_malloc_array(2 * HOUGH_MAX_OFFSET + 1, sizeof(hspace_elem)); + for (j = 0; j < HOUGH_MAX_OFFSET; j++) { + hspace[i][j].score = 0; + hspace[i][j].dist = 99999; + } + } + + /* l1 distances */ + for (i = 0, f = first; i < COURSE_SIZE && f->next; i++, f = f->next) { + pairs[i].size = 0; + pairs[i].dist = 99999; + pairs[i].a = f; + for (j = 0, s = second; j < COURSE_SIZE && s->next; j++, s = s->next) { + /* l1 distance of finesignature */ + l1dist = get_l1dist(ctx, sc, f->framesig, s->framesig); + if (l1dist < sc->thl1) { + if (l1dist < pairs[i].dist) { + pairs[i].size = 1; + pairs[i].dist = l1dist; + pairs[i].b_pos[0] = j; + pairs[i].b[0] = s; + } else if (l1dist == pairs[i].dist) { + pairs[i].b[pairs[i].size] = s; + pairs[i].b_pos[pairs[i].size] = j; + pairs[i].size++; + } + } + } + } + /* last incomplete coursesignature */ + if (f->next == NULL) { + for (; i < COURSE_SIZE; i++) { + pairs[i].size = 0; + pairs[i].dist = 99999; + } + } + + /* hough transformation */ + for (i = 0; i < COURSE_SIZE; i++) { + for (j = 0; j < pairs[i].size; j++) { + for (k = i + 1; k < COURSE_SIZE; k++) { + for (l = 0; l < pairs[k].size; l++) { + if (pairs[i].b[j] != pairs[k].b[l]) { + /* linear regression */ + m = (pairs[k].b_pos[l]-pairs[i].b_pos[j]) / (k-i); /* good value between 0.0 - 2.0 */ + framerate = (int) m*30 + 0.5; /* round up to 0 - 60 */ + if (framerate>0 && framerate <= MAX_FRAMERATE) { + offset = pairs[i].b_pos[j] - ((int) m*i + 0.5); /* only second part has to be rounded up */ + if (offset > -HOUGH_MAX_OFFSET && offset < HOUGH_MAX_OFFSET) { + if (pairs[i].dist < pairs[k].dist) { + if (pairs[i].dist < hspace[framerate-1][offset+HOUGH_MAX_OFFSET].dist) { + hspace[framerate-1][offset+HOUGH_MAX_OFFSET].dist = pairs[i].dist; + hspace[framerate-1][offset+HOUGH_MAX_OFFSET].a = pairs[i].a; + hspace[framerate-1][offset+HOUGH_MAX_OFFSET].b = pairs[i].b[j]; + } + } else { + if (pairs[k].dist < hspace[framerate-1][offset+HOUGH_MAX_OFFSET].dist) { + hspace[framerate-1][offset+HOUGH_MAX_OFFSET].dist = pairs[k].dist; + hspace[framerate-1][offset+HOUGH_MAX_OFFSET].a = pairs[k].a; + hspace[framerate-1][offset+HOUGH_MAX_OFFSET].b = pairs[k].b[l]; + } + } + + score = hspace[framerate-1][offset+HOUGH_MAX_OFFSET].score + 1; + if (score > hmax ) + hmax = score; + hspace[framerate-1][offset+HOUGH_MAX_OFFSET].score = score; + } + } + } + } + } + } + } + + if (hmax > 0) { + hmax = (int) (0.7*hmax); + for (i = 0; i < MAX_FRAMERATE; i++) { + for (j = 0; j < HOUGH_MAX_OFFSET; j++) { + if (hmax < hspace[i][j].score) { + if (c == NULL) { + c = av_malloc(sizeof(MatchingInfo)); + if (!c) + av_log(ctx, AV_LOG_FATAL, "Could not allocate memory"); + cands = c; + } else { + c->next = av_malloc(sizeof(MatchingInfo)); + if (!c->next) + av_log(ctx, AV_LOG_FATAL, "Could not allocate memory"); + c = c->next; + } + c->framerateratio = (i+1.0) / 30; + c->score = hspace[i][j].score; + c->offset = j-90; + c->first = hspace[i][j].a; + c->second = hspace[i][j].b; + c->next = NULL; + + /* not used */ + c->meandist = 0; + c->matchframes = 0; + c->whole = 0; + } + } + } + } + for (i = 0; i < MAX_FRAMERATE; i++) { + av_freep(&hspace[i]); + } + av_freep(&hspace); + return cands; +} + +static int iterate_frame(double frr, FineSignature **a, FineSignature **b, int fcount, int *bcount, int dir) +{ + int step; + + /* between 1 and 2, because frr is between 1 and 2 */ + step = ((int) 0.5 + fcount * frr) /* current frame */ + -((int) 0.5 + (fcount-1) * frr);/* last frame */ + + if (dir == DIR_NEXT) { + if (frr >= 1.0) { + if ((*a)->next) { + *a = (*a)->next; + } else { + return DIR_NEXT_END; + } + + if (step == 1) { + if ((*b)->next) { + *b = (*b)->next; + (*bcount)++; + } else { + return DIR_NEXT_END; + } + } else { + if ((*b)->next && (*b)->next->next) { + *b = (*b)->next->next; + (*bcount)++; + } else { + return DIR_NEXT_END; + } + } + } else { + if ((*b)->next) { + *b = (*b)->next; + (*bcount)++; + } else { + return DIR_NEXT_END; + } + + if (step == 1) { + if ((*a)->next) { + *a = (*a)->next; + } else { + return DIR_NEXT_END; + } + } else { + if ((*a)->next && (*a)->next->next) { + *a = (*a)->next->next; + } else { + return DIR_NEXT_END; + } + } + } + return DIR_NEXT; + } else { + if (frr >= 1.0) { + if ((*a)->prev) { + *a = (*a)->prev; + } else { + return DIR_PREV_END; + } + + if (step == 1) { + if ((*b)->prev) { + *b = (*b)->prev; + (*bcount)++; + } else { + return DIR_PREV_END; + } + } else { + if ((*b)->prev && (*b)->prev->prev) { + *b = (*b)->prev->prev; + (*bcount)++; + } else { + return DIR_PREV_END; + } + } + } else { + if ((*b)->prev) { + *b = (*b)->prev; + (*bcount)++; + } else { + return DIR_PREV_END; + } + + if (step == 1) { + if ((*a)->prev) { + *a = (*a)->prev; + } else { + return DIR_PREV_END; + } + } else { + if ((*a)->prev && (*a)->prev->prev) { + *a = (*a)->prev->prev; + } else { + return DIR_PREV_END; + } + } + } + return DIR_PREV; + } +} + +static MatchingInfo evaluate_parameters(AVFilterContext *ctx, SignatureContext *sc, MatchingInfo *infos, MatchingInfo bestmatch, int mode) +{ + int dist, distsum = 0, bcount = 1, dir = DIR_NEXT; + int fcount = 0, goodfcount = 0, gooda = 0, goodb = 0; + double meandist, minmeandist = bestmatch.meandist; + int tolerancecount = 0; + FineSignature *a,*b, *aprev, *bprev; + int status = STATUS_NULL; + + for (; infos != NULL; infos = infos->next) { + a = infos->first; + b = infos->second; + while (1) { + dist = get_l1dist(ctx, sc, a->framesig, b->framesig); + + if (dist > sc->thl1) { + if (a->confidence >= 1 || b->confidence >= 1) { + /* bad frame (because high different information) */ + tolerancecount++; + } + + if (tolerancecount > 2) { + a = aprev; + b = bprev; + if (dir == DIR_NEXT) { + /* turn around */ + a = infos->first; + b = infos->second; + dir = DIR_PREV; + } else { + break; + } + } + } else { + /* good frame */ + distsum += dist; + goodfcount++; + tolerancecount=0; + + aprev = a; + bprev = b; + + if (a->confidence < 1) gooda++; + if (b->confidence < 1) goodb++; + } + + fcount++; + + dir = iterate_frame(infos->framerateratio, &a, &b, fcount, &bcount, dir); + if (dir == DIR_NEXT_END) { + status = STATUS_END_REACHED; + a = infos->first; + b = infos->second; + dir = iterate_frame(infos->framerateratio, &a, &b, fcount, &bcount, DIR_PREV); + } + + if (dir == DIR_PREV_END) { + status |= STATUS_BEGIN_REACHED; + break; + } + + if (sc->thdi != 0 && bcount >= sc->thdi) { + break; /* enough frames found */ + } + } + + if (bcount < sc->thdi) + continue; /* matching sequence is too short */ + if ((double) goodfcount / (double) fcount < sc->thit) + continue; + if ((double) goodfcount*0.5 < FFMAX(gooda, goodb)) + continue; + + meandist = (double) goodfcount / (double) distsum; + + if (meandist < minmeandist || + status == STATUS_END_REACHED | STATUS_BEGIN_REACHED || + mode == MODE_FAST){ + minmeandist = meandist; + /* bestcandidate in this iteration */ + bestmatch.meandist = meandist; + bestmatch.matchframes = bcount; + bestmatch.framerateratio = infos->framerateratio; + bestmatch.score = infos->score; + bestmatch.offset = infos->offset; + bestmatch.first = infos->first; + bestmatch.second = infos->second; + bestmatch.whole = 0; /* will be set to true later */ + bestmatch.next = NULL; + } + + /* whole sequence is automatically best match */ + if (status == (STATUS_END_REACHED | STATUS_BEGIN_REACHED)) { + bestmatch.whole = 1; + break; + } + + /* first matching sequence is enough, finding the best one is not necessary */ + if (mode == MODE_FAST) { + break; + } + } + return bestmatch; +} + +static void sll_free(MatchingInfo *sll) +{ + void *tmp; + while (sll) { + tmp = sll; + sll = sll->next; + av_freep(&tmp); + } +} + +static MatchingInfo lookup_signatures(AVFilterContext *ctx, SignatureContext *sc, StreamContext *first, StreamContext *second, int mode) +{ + CourseSignature *cs, *cs2; + MatchingInfo *infos; + MatchingInfo bestmatch; + MatchingInfo *i; + + cs = first->coursesiglist; + cs2 = second->coursesiglist; + + /* score of bestmatch is 0, if no match is found */ + bestmatch.score = 0; + bestmatch.meandist = 99999; + bestmatch.whole = 0; + + fill_l1distlut(sc->l1distlut); + + /* stage 1: course signature matching */ + if (find_next_coursecandidate(sc, second->coursesiglist, &cs, &cs2, 1) == 0) + return bestmatch; /* no candidate found */ + do { + av_log(ctx, AV_LOG_DEBUG, "Stage 1: got coursesignature pair. indices of first frame: %d and %d\n", cs->first->index, cs2->first->index); + /* stage 2: l1-distance and hough-transform */ + av_log(ctx, AV_LOG_DEBUG, "Stage 2: calculate matching parameters\n"); + infos = get_matching_parameters(ctx, sc, cs->first, cs2->first); + if (av_log_get_level() == AV_LOG_DEBUG) { + for (i = infos; i != NULL; i = i->next) { + av_log(ctx, AV_LOG_DEBUG, "Stage 2: matching pair at %d and %d, ratio %f, offset %d\n", i->first->index, i->second->index, i->framerateratio, i->offset); + } + } + /* stage 3: evaluation */ + av_log(ctx, AV_LOG_DEBUG, "Stage 3: evaluate\n"); + if (infos) { + bestmatch = evaluate_parameters(ctx, sc, infos, bestmatch, mode); + av_log(ctx, AV_LOG_DEBUG, "Stage 3: best matching pair at %d and %d, ratio %f, offset %d, score %d, %d frames matching\n", bestmatch.first->index, bestmatch.second->index, bestmatch.framerateratio, bestmatch.offset, bestmatch.score, bestmatch.matchframes); + sll_free(infos); + } + } while (find_next_coursecandidate(sc, second->coursesiglist, &cs, &cs2, 0) && !bestmatch.whole); + return bestmatch; + +} diff --git a/libavfilter/version.h b/libavfilter/version.h index 48b9d9a..f4ade1c 100644 --- a/libavfilter/version.h +++ b/libavfilter/version.h @@ -30,7 +30,7 @@ #include "libavutil/version.h" #define LIBAVFILTER_VERSION_MAJOR 6 -#define LIBAVFILTER_VERSION_MINOR 42 +#define LIBAVFILTER_VERSION_MINOR 43 #define LIBAVFILTER_VERSION_MICRO 100 #define LIBAVFILTER_VERSION_INT AV_VERSION_INT(LIBAVFILTER_VERSION_MAJOR, \ diff --git a/libavfilter/vf_signature.c b/libavfilter/vf_signature.c new file mode 100644 index 0000000..2b51724 --- /dev/null +++ b/libavfilter/vf_signature.c @@ -0,0 +1,771 @@ +/* + * Copyright (c) 2016 Gerion Entrup + * + * This file is part of FFmpeg. + * + * FFmpeg is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * FFmpeg is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with FFmpeg; if not, write to the Free Software Foundation, Inc., + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. + */ + +/** + * @file + * MPEG-7 video signature calculation and lookup filter + * @see http://epubs.surrey.ac.uk/531590/1/MPEG-7%20Video%20Signature%20Author%27s%20Copy.pdf + */ + +#include <float.h> +#include "libavcodec/put_bits.h" +#include "libavformat/avformat.h" +#include "libavutil/opt.h" +#include "libavutil/avstring.h" +#include "libavutil/intreadwrite.h" +#include "libavutil/timestamp.h" +#include "avfilter.h" +#include "internal.h" +#include "signature.h" +#include "signature_lookup.c" + +#define OFFSET(x) offsetof(SignatureContext, x) +#define FLAGS AV_OPT_FLAG_VIDEO_PARAM +#define BLOCK_LCM (int64_t) 476985600 + +static const AVOption signature_options[] = { + { "detectmode", "set the detectmode, possible values: off (default), full, fast", + OFFSET(mode), AV_OPT_TYPE_INT, {.i64 = MODE_OFF}, 0, NB_LOOKUP_MODE-1, FLAGS, "mode" }, + { "off", NULL, 0, AV_OPT_TYPE_CONST, {.i64 = MODE_OFF}, 0, 0, .flags = FLAGS, "mode" }, + { "full", NULL, 0, AV_OPT_TYPE_CONST, {.i64 = MODE_FULL}, 0, 0, .flags = FLAGS, "mode" }, + { "fast", NULL, 0, AV_OPT_TYPE_CONST, {.i64 = MODE_FAST}, 0, 0, .flags = FLAGS, "mode" }, + { "nb_inputs", "set number of inputs", + OFFSET(nb_inputs), AV_OPT_TYPE_INT, {.i64 = 1}, 1, INT_MAX, FLAGS }, + { "filename", "set filename for output files", + OFFSET(filename), AV_OPT_TYPE_STRING, {.str = ""}, 0, NB_FORMATS-1, FLAGS }, + { "format", "set output format, possible values: binary (default), xml", + OFFSET(format), AV_OPT_TYPE_INT, {.i64 = FORMAT_BINARY}, 0, 1, FLAGS , "format" }, + { "binary", 0, 0, AV_OPT_TYPE_CONST, {.i64=FORMAT_BINARY}, 0, 0, FLAGS, "format" }, + { "xml", 0, 0, AV_OPT_TYPE_CONST, {.i64=FORMAT_XML}, 0, 0, FLAGS, "format" }, + { "th_d", "set threshold to detect one word as similar", + OFFSET(thworddist), AV_OPT_TYPE_INT, {.i64 = 9000}, 1, INT_MAX, FLAGS }, + { "th_dc", "set threshold to detect all words as similar", + OFFSET(thcomposdist), AV_OPT_TYPE_INT, {.i64 = 60000}, 1, INT_MAX, FLAGS }, + { "th_xh", "set threshold to detect frames as similar", + OFFSET(thl1), AV_OPT_TYPE_INT, {.i64 = 116}, 1, INT_MAX, FLAGS }, + { "th_di", "minimum length of matching sequence in frames", + OFFSET(thdi), AV_OPT_TYPE_INT, {.i64 = 0}, 0, INT_MAX, FLAGS }, + { "th_it", "set threshold for relation of good to all frames", + OFFSET(thit), AV_OPT_TYPE_DOUBLE, {.dbl = 0.5}, 0.0, 1.0, FLAGS }, + { NULL } +}; + +AVFILTER_DEFINE_CLASS(signature); + +static int query_formats(AVFilterContext *ctx) +{ + static const enum AVPixelFormat pix_fmts[] = { + AV_PIX_FMT_GRAY8, + AV_PIX_FMT_YUV410P, AV_PIX_FMT_YUV411P, + AV_PIX_FMT_YUV420P, AV_PIX_FMT_YUV422P, + AV_PIX_FMT_YUV440P, AV_PIX_FMT_YUV444P, + AV_PIX_FMT_YUVJ411P, AV_PIX_FMT_YUVJ420P, + AV_PIX_FMT_YUVJ422P, AV_PIX_FMT_YUVJ444P, + AV_PIX_FMT_YUVJ440P, + AV_PIX_FMT_NV12, AV_PIX_FMT_NV21, + AV_PIX_FMT_NONE + }; + + return ff_set_common_formats(ctx, ff_make_format_list(pix_fmts)); +} + +static int config_input(AVFilterLink *inlink) +{ + AVFilterContext *ctx = inlink->dst; + SignatureContext *sic = ctx->priv; + StreamContext *sc = &(sic->streamcontexts[FF_INLINK_IDX(inlink)]); + + sc->time_base = inlink->time_base; + /* test for overflow */ + sc->divide = (((uint64_t) inlink->w/32) * (inlink->w/32 + 1) * (inlink->h/32 * inlink->h/32 + 1) > INT64_MAX / (BLOCK_LCM * 255)); + if (sc->divide) { + av_log(ctx, AV_LOG_WARNING, "Input dimension too high for precise calculation, numbers will be rounded.\n"); + } + sc->w = inlink->w; + sc->h = inlink->h; + return 0; +} + +static int get_block_size(const Block *b) +{ + return (b->to.y - b->up.y + 1) * (b->to.x - b->up.x + 1); +} + +static uint64_t get_block_sum(StreamContext *sc, uint64_t intpic[32][32], const Block *b) +{ + uint64_t sum = 0; + + int x0,y0,x1,y1; + + x0 = b->up.x; + y0 = b->up.y; + x1 = b->to.x; + y1 = b->to.y; + + if (x0-1 >= 0 && y0-1 >= 0) { + sum = intpic[y1][x1] + intpic[y0-1][x0-1] - intpic[y1][x0-1] - intpic[y0-1][x1]; + }else if(x0-1 >= 0){ + sum = intpic[y1][x1] - intpic[y1][x0-1]; + }else if(y0-1 >= 0){ + sum = intpic[y1][x1] - intpic[y0-1][x1]; + }else{ + sum = intpic[y1][x1]; + } + return sum; +} + +static int cmp(const uint64_t *a, const uint64_t *b) +{ + return *a < *b ? -1 : ( *a > *b ? 1 : 0 ); +} + +/** + * sets the bit at position pos to 1 in data + */ +static void set_bit(uint8_t* data, size_t pos) +{ + uint8_t mask = 1 << 7-(pos%8); + data[pos/8] |= mask; +} + +static int filter_frame(AVFilterLink *inlink, AVFrame *picref) +{ + AVFilterContext *ctx = inlink->dst; + SignatureContext *sic = ctx->priv; + StreamContext *sc = &(sic->streamcontexts[FF_INLINK_IDX(inlink)]); + FineSignature* fs; + + static const uint8_t pot3[5] = { 3*3*3*3, 3*3*3, 3*3, 3, 1 }; + /* indexes of words : 210,217,219,274,334 44,175,233,270,273 57,70,103,237,269 100,285,295,337,354 101,102,111,275,296 + s2usw = sorted to unsorted wordvec: 44 is at index 5, 57 at index 10... + */ + static const unsigned int wordvec[25] = {44,57,70,100,101,102,103,111,175,210,217,219,233,237,269,270,273,274,275,285,295,296,334,337,354}; + static const uint8_t s2usw[25] = { 5,10,11, 15, 20, 21, 12, 22, 6, 0, 1, 2, 7, 13, 14, 8, 9, 3, 23, 16, 17, 24, 4, 18, 19}; + + uint8_t wordt2b[5] = { 0, 0, 0, 0, 0 }; /* word ternary to binary */ + uint64_t intpic[32][32]; + uint64_t rowcount; + uint8_t *p = picref->data[0]; + int inti, intj; + int *intjlut; + + uint64_t conflist[DIFFELEM_SIZE]; + int f = 0, g = 0, w = 0; + int32_t dh1 = 1, dh2 = 1, dw1 = 1, dw2 = 1, a, b; + int64_t denom; + int i,j,k,ternary; + uint64_t blocksum; + int blocksize; + int64_t th; /* threshold */ + int64_t sum; + + int64_t precfactor = (sc->divide) ? 65536 : BLOCK_LCM; + + /* initialize fs */ + if (sc->curfinesig) { + fs = av_mallocz(sizeof(FineSignature)); + if (!fs) + return AVERROR(ENOMEM); + sc->curfinesig->next = fs; + fs->prev = sc->curfinesig; + sc->curfinesig = fs; + } else { + fs = sc->curfinesig = sc->finesiglist; + sc->curcoursesig1->first = fs; + } + + fs->pts = picref->pts; + fs->index = sc->lastindex++; + + memset(intpic, 0, sizeof(uint64_t)*32*32); + intjlut = av_malloc_array(inlink->w, sizeof(int)); + if (!intjlut) + return AVERROR(ENOMEM); + for (i = 0; i < inlink->w; i++) { + intjlut[i] = (i*32)/inlink->w; + } + + for (i = 0; i < inlink->h; i++) { + inti = (i*32)/inlink->h; + for (j = 0; j < inlink->w; j++) { + intj = intjlut[j]; + intpic[inti][intj] += p[j]; + } + p += picref->linesize[0]; + } + av_freep(&intjlut); + + /* The following calculates a summed area table (intpic) and brings the numbers + * in intpic to the same denominator. + * So you only have to handle the numinator in the following sections. + */ + dh1 = inlink->h / 32; + if (inlink->h % 32) + dh2 = dh1 + 1; + dw1 = inlink->w / 32; + if (inlink->w % 32) + dw2 = dw1 + 1; + denom = (sc->divide) ? dh1 * dh2 * dw1 * dw2 : 1; + + for (i = 0; i < 32; i++) { + rowcount = 0; + a = 1; + if (dh2 > 1) { + a = ((inlink->h*(i+1))%32 == 0) ? (inlink->h*(i+1))/32 - 1 : (inlink->h*(i+1))/32; + a -= ((inlink->h*i)%32 == 0) ? (inlink->h*i)/32 - 1 : (inlink->h*i)/32; + a = (a == dh1)? dh2 : dh1; + } + for (j = 0; j < 32; j++) { + b = 1; + if (dw2 > 1) { + b = ((inlink->w*(j+1))%32 == 0) ? (inlink->w*(j+1))/32 - 1 : (inlink->w*(j+1))/32; + b -= ((inlink->w*j)%32 == 0) ? (inlink->w*j)/32 - 1 : (inlink->w*j)/32; + b = (b == dw1)? dw2 : dw1; + } + rowcount += intpic[i][j] * a * b * precfactor / denom; + if (i > 0) { + intpic[i][j] = intpic[i-1][j] + rowcount; + } else { + intpic[i][j] = rowcount; + } + } + } + + denom = (sc->divide) ? 1 : dh1 * dh2 * dw1 * dw2; + + for (i = 0; i < ELEMENT_COUNT; i++) { + const ElemCat* elemcat = elements[i]; + int64_t* elemsignature; + uint64_t* sortsignature; + + elemsignature = av_malloc_array(elemcat->elem_count, sizeof(int64_t)); + if (!elemsignature) + return AVERROR(ENOMEM); + sortsignature = av_malloc_array(elemcat->elem_count, sizeof(int64_t)); + if (!sortsignature) + return AVERROR(ENOMEM); + + for (j = 0; j < elemcat->elem_count; j++) { + blocksum = 0; + blocksize = 0; + for (k = 0; k < elemcat->left_count; k++) { + blocksum += get_block_sum(sc, intpic, &elemcat->blocks[j*elemcat->block_count+k]); + blocksize += get_block_size(&elemcat->blocks[j*elemcat->block_count+k]); + } + sum = blocksum / blocksize; + if (elemcat->av_elem) { + sum -= 128 * precfactor * denom; + } else { + blocksum = 0; + blocksize = 0; + for (; k < elemcat->block_count; k++) { + blocksum += get_block_sum(sc, intpic, &elemcat->blocks[j*elemcat->block_count+k]); + blocksize += get_block_size(&elemcat->blocks[j*elemcat->block_count+k]); + } + sum -= blocksum / blocksize; + conflist[g++] = FFABS(sum * 8 / (precfactor * denom)); + } + + elemsignature[j] = sum; + sortsignature[j] = FFABS(sum); + } + + /* get threshold */ + qsort(sortsignature, elemcat->elem_count, sizeof(uint64_t), (void*) cmp); + th = sortsignature[(int) (elemcat->elem_count*0.333)]; + + /* ternarize */ + for (j = 0; j < elemcat->elem_count; j++) { + if (elemsignature[j] < -th) { + ternary = 0; + } else if (elemsignature[j] <= th) { + ternary = 1; + } else { + ternary = 2; + } + fs->framesig[f/5] += ternary * pot3[f%5]; + + if (f == wordvec[w]) { + fs->words[s2usw[w]/5] += ternary * pot3[wordt2b[s2usw[w]/5]++]; + if (w < 24) + w++; + } + f++; + } + av_freep(&elemsignature); + av_freep(&sortsignature); + } + + /* confidence */ + qsort(conflist, DIFFELEM_SIZE, sizeof(uint64_t), (void*) cmp); + fs->confidence = FFMIN(conflist[DIFFELEM_SIZE/2], 255); + + /* coursesignature */ + if (sc->coursecount == 0) { + if (sc->curcoursesig2) { + sc->curcoursesig1 = av_mallocz(sizeof(CourseSignature)); + if (!sc->curcoursesig1) + return AVERROR(ENOMEM); + sc->curcoursesig1->first = fs; + sc->curcoursesig2->next = sc->curcoursesig1; + sc->courseend = sc->curcoursesig1; + } + } + if (sc->coursecount == 45) { + sc->midcourse = 1; + sc->curcoursesig2 = av_mallocz(sizeof(CourseSignature)); + if (!sc->curcoursesig2) + return AVERROR(ENOMEM); + sc->curcoursesig2->first = fs; + sc->curcoursesig1->next = sc->curcoursesig2; + sc->courseend = sc->curcoursesig2; + } + for (i = 0; i < 5; i++) { + set_bit(sc->curcoursesig1->data[i], fs->words[i]); + } + /* assuming the actual frame is the last */ + sc->curcoursesig1->last = fs; + if (sc->midcourse) { + for (i = 0; i < 5; i++) { + set_bit(sc->curcoursesig2->data[i], fs->words[i]); + } + sc->curcoursesig2->last = fs; + } + + sc->coursecount = (sc->coursecount+1)%90; + + /* debug printing finesignature */ + if (av_log_get_level() == AV_LOG_DEBUG) { + av_log(ctx, AV_LOG_DEBUG, "input %d, confidence: %d\n", FF_INLINK_IDX(inlink), fs->confidence); + + av_log(ctx, AV_LOG_DEBUG, "words:"); + for (i = 0; i < 5; i++) { + av_log(ctx, AV_LOG_DEBUG, " %d:", fs->words[i] ); + av_log(ctx, AV_LOG_DEBUG, " %d", fs->words[i] / pot3[0] ); + for (j = 1; j < 5; j++) + av_log(ctx, AV_LOG_DEBUG, ",%d", fs->words[i] % pot3[j-1] / pot3[j] ); + av_log(ctx, AV_LOG_DEBUG, ";"); + } + av_log(ctx, AV_LOG_DEBUG, "\n"); + + av_log(ctx, AV_LOG_DEBUG, "framesignature:"); + for (i = 0; i < SIGELEM_SIZE/5; i++) { + av_log(ctx, AV_LOG_DEBUG, " %d", fs->framesig[i] / pot3[0] ); + for (j = 1; j < 5; j++) + av_log(ctx, AV_LOG_DEBUG, ",%d", fs->framesig[i] % pot3[j-1] / pot3[j] ); + } + av_log(ctx, AV_LOG_DEBUG, "\n"); + } + + if (FF_INLINK_IDX(inlink) == 0) + return ff_filter_frame(inlink->dst->outputs[0], picref); + return 1; +} + +static int xml_export(AVFilterContext *ctx, StreamContext *sc, const char* filename) +{ + FineSignature* fs; + CourseSignature* cs; + int i,j; + FILE* f; + unsigned int pot3[5] = { 3*3*3*3, 3*3*3, 3*3, 3, 1 }; + + f = fopen(filename, "w"); + if (!f) { + int err = AVERROR(EINVAL); + char buf[128]; + av_strerror(err, buf, sizeof(buf)); + av_log(ctx, AV_LOG_ERROR, "cannot open xml file %s: %s\n", filename, buf); + return err; + } + + /* header */ + fprintf(f, "<?xml version='1.0' encoding='ASCII' ?>\n"); + fprintf(f, "<Mpeg7 xmlns=\"urn:mpeg:mpeg7:schema:2001\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xsi:schemaLocation=\"urn:mpeg:mpeg7:schema:2001 schema/Mpeg7-2001.xsd\">\n"); + fprintf(f, " <DescriptionUnit xsi:type=\"DescriptorCollectionType\">\n"); + fprintf(f, " <Descriptor xsi:type=\"VideoSignatureType\">\n"); + fprintf(f, " <VideoSignatureRegion>\n"); + fprintf(f, " <VideoSignatureSpatialRegion>\n"); + fprintf(f, " <Pixel>0 0 </Pixel>\n"); + fprintf(f, " <Pixel>%d %d </Pixel>\n", sc->w - 1, sc->h - 1); + fprintf(f, " </VideoSignatureSpatialRegion>\n"); + fprintf(f, " <StartFrameOfSpatialRegion>0</StartFrameOfSpatialRegion>\n"); + /* hoping num is 1, other values are vague */ + fprintf(f, " <MediaTimeUnit>%d</MediaTimeUnit>\n", sc->time_base.den / sc->time_base.num); + fprintf(f, " <MediaTimeOfSpatialRegion>\n"); + fprintf(f, " <StartMediaTimeOfSpatialRegion>0</StartMediaTimeOfSpatialRegion>\n"); + fprintf(f, " <EndMediaTimeOfSpatialRegion>%" PRIu64 "</EndMediaTimeOfSpatialRegion>\n", sc->courseend->last->pts); + fprintf(f, " </MediaTimeOfSpatialRegion>\n"); + + /* coursesignatures */ + for (cs = sc->coursesiglist; cs; cs = cs->next) { + fprintf(f, " <VSVideoSegment>\n"); + fprintf(f, " <StartFrameOfSegment>%" PRIu32 "</StartFrameOfSegment>\n", cs->first->index); + fprintf(f, " <EndFrameOfSegment>%" PRIu32 "</EndFrameOfSegment>\n", cs->last->index); + fprintf(f, " <MediaTimeOfSegment>\n"); + fprintf(f, " <StartMediaTimeOfSegment>%" PRIu64 "</StartMediaTimeOfSegment>\n", cs->first->pts); + fprintf(f, " <EndMediaTimeOfSegment>%" PRIu64 "</EndMediaTimeOfSegment>\n", cs->last->pts); + fprintf(f, " </MediaTimeOfSegment>\n"); + for (i = 0; i < 5; i++) { + fprintf(f, " <BagOfWords>"); + for (j = 0; j < 31; j++) { + uint8_t n = cs->data[i][j]; + if (j < 30) { + fprintf(f, "%d %d %d %d %d %d %d %d ", (n & 0x80) >> 7, + (n & 0x40) >> 6, + (n & 0x20) >> 5, + (n & 0x10) >> 4, + (n & 0x08) >> 3, + (n & 0x04) >> 2, + (n & 0x02) >> 1, + (n & 0x01)); + } else { + /* print only 3 bit in last byte */ + fprintf(f, "%d %d %d ", (n & 0x80) >> 7, + (n & 0x40) >> 6, + (n & 0x20) >> 5); + } + } + fprintf(f, "</BagOfWords>\n"); + } + fprintf(f, " </VSVideoSegment>\n"); + } + + /* finesignatures */ + for (fs = sc->finesiglist; fs; fs = fs->next) { + fprintf(f, " <VideoFrame>\n"); + fprintf(f, " <MediaTimeOfFrame>%" PRIu64 "</MediaTimeOfFrame>\n", fs->pts); + /* confidence */ + fprintf(f, " <FrameConfidence>%d</FrameConfidence>\n", fs->confidence); + /* words */ + fprintf(f, " <Word>"); + for (i = 0; i < 5; i++) { + fprintf(f, "%d ", fs->words[i]); + if (i < 4) { + fprintf(f, " "); + } + } + fprintf(f, "</Word>\n"); + /* framesignature */ + fprintf(f, " <FrameSignature>"); + for (i = 0; i< SIGELEM_SIZE/5; i++) { + if (i > 0) { + fprintf(f, " "); + } + fprintf(f, "%d ", fs->framesig[i] / pot3[0]); + for (j = 1; j < 5; j++) + fprintf(f, " %d ", fs->framesig[i] % pot3[j-1] / pot3[j] ); + } + fprintf(f, "</FrameSignature>\n"); + fprintf(f, " </VideoFrame>\n"); + } + fprintf(f, " </VideoSignatureRegion>\n"); + fprintf(f, " </Descriptor>\n"); + fprintf(f, " </DescriptionUnit>\n"); + fprintf(f, "</Mpeg7>\n"); + + fclose(f); + return 0; +} + +static int binary_export(AVFilterContext *ctx, StreamContext *sc, const char* filename) +{ + FILE* f; + FineSignature* fs; + CourseSignature* cs; + uint32_t numofsegments = (sc->lastindex + 44)/45; + int i,j; + PutBitContext buf; + /* buffer + header + coursesignatures + finesignature */ + int len = (512 + 6 * 32 + 3*16 + 2 + + numofsegments * (4*32 + 1 + 5*243) + + sc->lastindex * (2 + 32 + 6*8 + 608)) / 8; + uint8_t* buffer = av_malloc_array(len, sizeof(uint8_t)); + if (!buffer) + return AVERROR(ENOMEM); + + f = fopen(filename, "wb"); + if (!f) { + int err = AVERROR(EINVAL); + char buf[128]; + av_strerror(err, buf, sizeof(buf)); + av_log(ctx, AV_LOG_ERROR, "cannot open file %s: %s\n", filename, buf); + return err; + } + init_put_bits(&buf, buffer, len); + + put_bits32(&buf, 1); /* NumOfSpatial Regions, only 1 supported */ + put_bits(&buf, 1, 1); /* SpatialLocationFlag, always the whole image */ + put_bits32(&buf, 0); /* PixelX,1 PixelY,1, 0,0 */ + put_bits(&buf, 16, sc->w-1 & 0xFFFF); /* PixelX,2 */ + put_bits(&buf, 16, sc->h-1 & 0xFFFF); /* PixelY,2 */ + put_bits32(&buf, 0); /* StartFrameOfSpatialRegion */ + put_bits32(&buf, sc->lastindex); /* NumOfFrames */ + /* hoping num is 1, other values are vague */ + /* den/num might be greater than 16 bit, so cutting it */ + put_bits(&buf, 16, 0xFFFF & (sc->time_base.den / sc->time_base.num)); /* MediaTimeUnit */ + put_bits(&buf, 1, 1); /* MediaTimeFlagOfSpatialRegion */ + put_bits32(&buf, 0); /* StartMediaTimeOfSpatialRegion */ + put_bits32(&buf, 0xFFFFFFFF & sc->courseend->last->pts); /* EndMediaTimeOfSpatialRegion */ + put_bits32(&buf, numofsegments); /* NumOfSegments */ + /* coursesignatures */ + for (cs = sc->coursesiglist; cs; cs = cs->next) { + put_bits32(&buf, cs->first->index); /* StartFrameOfSegment */ + put_bits32(&buf, cs->last->index); /* EndFrameOfSegment */ + put_bits(&buf, 1, 1); /* MediaTimeFlagOfSegment */ + put_bits32(&buf, 0xFFFFFFFF & cs->first->pts); /* StartMediaTimeOfSegment */ + put_bits32(&buf, 0xFFFFFFFF & cs->last->pts); /* EndMediaTimeOfSegment */ + for (i = 0; i < 5; i++) { + /* put 243 bits ( = 7 * 32 + 19 = 8 * 28 + 19) into buffer */ + for (j = 0; j < 30; j++) { + //TODO is it faster to use put_bits32 and shift? + put_bits(&buf, 8, cs->data[i][j]); + } + put_bits(&buf, 3, cs->data[i][30] >> 5); + } + } + /* finesignatures */ + put_bits(&buf, 1, 0); /* CompressionFlag, only 0 supported */ + for (fs = sc->finesiglist; fs; fs = fs->next) { + put_bits(&buf, 1, 1); /* MediaTimeFlagOfFrame */ + put_bits32(&buf, 0xFFFFFFFF & fs->pts); /* MediaTimeOfFrame */ + put_bits(&buf, 8, fs->confidence); /* FrameConfidence */ + for (i = 0; i < 5; i++) { + put_bits(&buf, 8, fs->words[i]); /* Words */ + } + /* framesignature */ + for (i = 0; i < SIGELEM_SIZE/5; i++) { + put_bits(&buf, 8, fs->framesig[i]); + } + } + + avpriv_align_put_bits(&buf); + flush_put_bits(&buf); + fwrite(buffer, 1, put_bits_count(&buf)/8, f); + fclose(f); + av_freep(&buffer); + return 0; +} + +static int export(AVFilterContext *ctx, StreamContext *sc, int input) +{ + SignatureContext* sic = ctx->priv; + char filename[1024]; + + if (sic->nb_inputs > 1) { + /* error already handled */ + av_get_frame_filename(filename, sizeof(filename), sic->filename, input); + } else { + strcpy(filename, sic->filename); + } + if (sic->format == FORMAT_XML) { + return xml_export(ctx, sc, filename); + } else { + return binary_export(ctx, sc, filename); + } + +} + +static int request_frame(AVFilterLink *outlink) +{ + AVFilterContext *ctx = outlink->src; + SignatureContext *sc = ctx->priv; + int i, ret; + + for (i = 0; i < sc->nb_inputs; i++){ + ret = ff_request_frame(ctx->inputs[i]); + // TODO handle this in a better way? + // Problem is the following: + // Assuming two inputs, inputA with 50 frames, inputB with 100 frames + // simply returning ret when < 0 would result in not filtering inputB + // after 50 frames anymore, not wanted + // only returning ret at the end would result in only respecting the error + // values of the last input, please comment + if (ret < 0 && ret != AVERROR_EOF) + return ret; + } + return ret; +} + +static av_cold int init(AVFilterContext *ctx) +{ + + SignatureContext *sic = ctx->priv; + StreamContext *sc; + int i, ret; + char tmp[1024]; + + sic->streamcontexts = av_mallocz(sic->nb_inputs * sizeof(StreamContext)); + if (!sic->streamcontexts) + return AVERROR(ENOMEM); + + for (i = 0; i < sic->nb_inputs; i++) { + AVFilterPad pad = { + .type = AVMEDIA_TYPE_VIDEO, + .name = av_asprintf("in%d", i), + .config_props = config_input, + .filter_frame = filter_frame, + }; + + if (!pad.name) + return AVERROR(ENOMEM); + + sc = &(sic->streamcontexts[i]); + + sc->lastindex = 0; + sc->finesiglist = av_mallocz(sizeof(FineSignature)); + if (!sc->finesiglist) + return AVERROR(ENOMEM); + sc->curfinesig = NULL; + + sc->coursesiglist = av_mallocz(sizeof(CourseSignature)); + if (!sc->coursesiglist) + return AVERROR(ENOMEM); + sc->curcoursesig1 = sc->coursesiglist; + sc->courseend = sc->coursesiglist; + sc->coursecount = 0; + sc->midcourse = 0; + + if ((ret = ff_insert_inpad(ctx, i, &pad)) < 0) { + av_freep(&pad.name); + return ret; + } + } + + /* check filename */ + if (sic->nb_inputs > 1 && strlen(sic->filename) > 0 && av_get_frame_filename(tmp, sizeof(tmp), sic->filename, 0) == -1) { + av_log(ctx, AV_LOG_ERROR, "The filename must contain %%d or %%0nd, if you have more than one input.\n"); + return AVERROR(EINVAL); + } + + return 0; +} + + + +static av_cold void uninit(AVFilterContext *ctx) +{ + SignatureContext *sic = ctx->priv; + StreamContext *sc, *sc2; + void* tmp; + FineSignature* finsig; + CourseSignature* cousig; + MatchingInfo match; + int i,j; + + //TODO export and especially lookup_signature can have a return value to show some error etc. + //How could this be handled in this function? + + /* signature export */ + if (strlen(sic->filename) > 0) { + for (i = 0; i < sic->nb_inputs; i++) { + /* if filter frame was called at least once */ + if (&(sic->streamcontexts[i]) && (&(sic->streamcontexts[i]))->curfinesig) + export(ctx, &(sic->streamcontexts[i]), i); + } + } + + /* signature lookup */ + if (sic->mode != MODE_OFF) { + /* iterate over every pair */ + for (i = 0; i < sic->nb_inputs; i++) { + sc = &(sic->streamcontexts[i]); + if (sc && sc->curfinesig) { + for (j = i+1; j < sic->nb_inputs; j++) { + sc2 = &(sic->streamcontexts[j]); + if (sc2 && sc2->curfinesig) { + match = lookup_signatures(ctx, sic, sc, sc2, sic->mode); + if (match.score != 0) { + av_log(ctx, AV_LOG_INFO, "matching of video %d at %f and %d at %f, %d frames matching\n", + i, ((double) match.first->pts * sc->time_base.num) / sc->time_base.den, + j, ((double) match.second->pts * sc2->time_base.num) / sc2->time_base.den, + match.matchframes); + if (match.whole) + av_log(ctx, AV_LOG_INFO, "whole video matching\n"); + } else { + av_log(ctx, AV_LOG_INFO, "no matching of video %d and %d\n", i, j); + } + } + } + } + } + } + + /* cleanup */ + if (sic->streamcontexts != NULL) { + for (i = 0; i < sic->nb_inputs; i++) { + sc = &(sic->streamcontexts[i]); + finsig = sc->finesiglist; + cousig = sc->coursesiglist; + + while (finsig) { + tmp = finsig; + finsig = finsig->next; + av_freep(&tmp); + } + sc->finesiglist = NULL; + + while (cousig) { + tmp = cousig; + cousig = cousig->next; + av_freep(&tmp); + } + sc->coursesiglist = NULL; + } + av_freep(&sic->streamcontexts); + } +} + +static int config_output(AVFilterLink *outlink) +{ + AVFilterContext *ctx = outlink->src; + AVFilterLink *inlink = ctx->inputs[0]; + + outlink->time_base = inlink->time_base; + outlink->frame_rate = inlink->frame_rate; + outlink->sample_aspect_ratio = inlink->sample_aspect_ratio; + outlink->w = inlink->w; + outlink->h = inlink->h; + + return 0; +} + +static const AVFilterPad signature_outputs[] = { + { + .name = "default", + .type = AVMEDIA_TYPE_VIDEO, + .request_frame = request_frame, + .config_props = config_output, + }, + { NULL } +}; + +AVFilter ff_vf_signature = { + .name = "signature", + .description = NULL_IF_CONFIG_SMALL("Calculate the MPEG-7 video signature"), + .priv_size = sizeof(SignatureContext), + .priv_class = &signature_class, + .init = init, + .uninit = uninit, + .query_formats = query_formats, + .outputs = signature_outputs, + .inputs = NULL, + .flags = AVFILTER_FLAG_DYNAMIC_INPUTS, +}; -- 2.7.3
_______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel