Asking for comments or a merge, thanks.
> -----Original Message-----
> From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf
> Of Guo, Yejun
> Sent: Monday, October 29, 2018 11:19 AM
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH V4] Add a filter implementing HDR
> image generation from a single exposure using deep CNNs
>
> any more comment? thanks.
>
> > -----Original Message-----
> > From: Guo, Yejun
> > Sent: Tuesday, October 23, 2018 6:46 AM
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Guo, Yejun <yejun....@intel.com>; Guo
> > Subject: [PATCH V4] Add a filter implementing HDR image generation
> > from a single exposure using deep CNNs
> >
> > See the algorithm's paper and code below.
> >
> > The filter's parameter looks like:
> > sdr2hdr=model_filename=/path_to_tensorflow_graph.pb:out_fmt=gbrp10le
> >
> > The input of the deep CNN model is RGB24, while the output is a float
> > for each color channel, so the filter's default output format is
> > gbrpf32le. gbrp10le is also supported as the output, so we can view
> > the rendered result in a player, as a reference.
> >
> > To generate the model file, we need to modify the original script a little:
> > - set name='y' for y_final within the script at
> >   https://github.com/gabrieleilertsen/hdrcnn/blob/master/network.py
> > - add the following code to the script at
> >   https://github.com/gabrieleilertsen/hdrcnn/blob/master/hdrcnn_predict.py
> >
> >   graph = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ["y"])
> >   tf.train.write_graph(graph, '.', 'graph.pb', as_text=False)
> >
> > The filter only works when the TensorFlow C API is available on the
> > system. The native backend is not supported, since the deep CNN model
> > contains several layer types besides CONV and DEPTH_TO_SPACE.
> >
> > https://arxiv.org/pdf/1710.07480.pdf:
> > author    = "Eilertsen, Gabriel and Kronander, Joel and Denes, Gyorgy and
> >              Mantiuk, Rafał and Unger, Jonas",
> > title     = "HDR image reconstruction from a single exposure using deep CNNs",
> > journal   = "ACM Transactions on Graphics (TOG)",
> > number    = "6",
> > volume    = "36",
> > articleno = "178",
> > year      = "2017"
> >
> > https://github.com/gabrieleilertsen/hdrcnn
> >
> > BTW, as a whole solution, HDR metadata should also be generated from
> > the SDR video so that the result can be encoded as an HDR video; that
> > is not supported yet. This patch just focuses on this paper.
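> >
> > For reference, the complete freezing step could look like the sketch
> > below when appended to hdrcnn_predict.py after the weights are loaded.
> > Only `sess` (the script's TF session) and the output node name "y"
> > come from the steps above; everything else here is illustrative:
> >
> >   # Sketch: freeze the hdrcnn graph into graph.pb (TensorFlow 1.x API).
> >   # Assumes `sess` is the tf.Session already created by the script and
> >   # that network.py was edited so the final op is named "y".
> >   import tensorflow as tf  # already imported by the script itself
> >
> >   graph = tf.graph_util.convert_variables_to_constants(
> >       sess,            # live session holding the trained weights
> >       sess.graph_def,  # graph definition to freeze
> >       ["y"])           # output node(s) to keep in the frozen graph
> >   tf.train.write_graph(graph, '.', 'graph.pb', as_text=False)
> >
> > The resulting graph.pb is what model_filename points to, e.g. (paths
> > and container purely illustrative):
> > ffmpeg -i input.mp4 -vf sdr2hdr=model_filename=./graph.pb:out_fmt=gbrp10le output.mp4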
> >
> > Signed-off-by: Guo, Yejun <yejun....@intel.com>
> > ---
> >  configure                |   1 +
> >  doc/filters.texi         |  35 +++++++
> >  libavfilter/Makefile     |   1 +
> >  libavfilter/allfilters.c |   1 +
> >  libavfilter/vf_sdr2hdr.c | 268 +++++++++++++++++++++++++++++++++++++++++++++++
> >  5 files changed, 306 insertions(+)
> >  create mode 100644 libavfilter/vf_sdr2hdr.c
> >
> > diff --git a/configure b/configure
> > index 85d5dd5..5e2efba 100755
> > --- a/configure
> > +++ b/configure
> > @@ -3438,6 +3438,7 @@ scale2ref_filter_deps="swscale"
> >  scale_filter_deps="swscale"
> >  scale_qsv_filter_deps="libmfx"
> >  select_filter_select="pixelutils"
> > +sdr2hdr_filter_deps="libtensorflow"
> >  sharpness_vaapi_filter_deps="vaapi"
> >  showcqt_filter_deps="avcodec avformat swscale"
> >  showcqt_filter_suggest="libfontconfig libfreetype"
> > diff --git a/doc/filters.texi b/doc/filters.texi
> > index 17e2549..bba9f87 100644
> > --- a/doc/filters.texi
> > +++ b/doc/filters.texi
> > @@ -14672,6 +14672,41 @@ Scale a subtitle stream (b) to match the main video (a) in size before overlayin
> >  @end example
> >  @end itemize
> >
> > +@section sdr2hdr
> > +
> > +HDR image generation from a single exposure using deep CNNs with the
> > +TensorFlow C library.
> > +
> > +@itemize
> > +@item
> > +paper: see @url{https://arxiv.org/pdf/1710.07480.pdf}
> > +
> > +@item
> > +code with model and trained parameters: see
> > +@url{https://github.com/gabrieleilertsen/hdrcnn}
> > +@end itemize
> > +
> > +The filter accepts the following options:
> > +
> > +@table @option
> > +
> > +@item model_filename
> > +Set the path to the model file specifying the network architecture and
> > +its parameters.
> > +
> > +@item out_fmt
> > +The data format of the filter's output.
> > +
> > +It accepts the following values:
> > +@table @samp
> > +@item gbrpf32le
> > +force gbrpf32le output
> > +
> > +@item gbrp10le
> > +force gbrp10le output
> > +@end table
> > +
> > +Default value is @samp{gbrpf32le}.
> > +
> > +@end table
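> > +
> > +For example, with a hypothetical frozen graph at @file{./graph.pb}
> > +(the option syntax follows the commit message above):
> > +@example
> > +ffmpeg -i INPUT -vf sdr2hdr=model_filename=./graph.pb:out_fmt=gbrp10le OUTPUT
> > +@end example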
> > +
> >  @anchor{selectivecolor}
> >  @section selectivecolor
> >
> > diff --git a/libavfilter/Makefile b/libavfilter/Makefile
> > index 62cc2f5..88e7da6 100644
> > --- a/libavfilter/Makefile
> > +++ b/libavfilter/Makefile
> > @@ -360,6 +360,7 @@ OBJS-$(CONFIG_SOBEL_OPENCL_FILTER)           += vf_convolution_opencl.o opencl.o
> >  OBJS-$(CONFIG_SPLIT_FILTER)                  += split.o
> >  OBJS-$(CONFIG_SPP_FILTER)                    += vf_spp.o
> >  OBJS-$(CONFIG_SR_FILTER)                     += vf_sr.o
> > +OBJS-$(CONFIG_SDR2HDR_FILTER)                += vf_sdr2hdr.o
> >  OBJS-$(CONFIG_SSIM_FILTER)                   += vf_ssim.o framesync.o
> >  OBJS-$(CONFIG_STEREO3D_FILTER)               += vf_stereo3d.o
> >  OBJS-$(CONFIG_STREAMSELECT_FILTER)           += f_streamselect.o framesync.o
> > diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
> > index 5e72803..1645c0f 100644
> > --- a/libavfilter/allfilters.c
> > +++ b/libavfilter/allfilters.c
> > @@ -319,6 +319,7 @@ extern AVFilter ff_vf_scale_npp;
> >  extern AVFilter ff_vf_scale_qsv;
> >  extern AVFilter ff_vf_scale_vaapi;
> >  extern AVFilter ff_vf_scale2ref;
> > +extern AVFilter ff_vf_sdr2hdr;
> >  extern AVFilter ff_vf_select;
> >  extern AVFilter ff_vf_selectivecolor;
> >  extern AVFilter ff_vf_sendcmd;
> > diff --git a/libavfilter/vf_sdr2hdr.c b/libavfilter/vf_sdr2hdr.c
> > new file mode 100644
> > index 0000000..109b907
> > --- /dev/null
> > +++ b/libavfilter/vf_sdr2hdr.c
> > @@ -0,0 +1,268 @@
> > +/*
> > + * Copyright (c) 2018 Guo Yejun
> > + *
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> > + */
> > +
> > +/**
> > + * @file
> > + * Filter implementing HDR image generation from a single exposure using deep CNNs.
> > + * https://arxiv.org/pdf/1710.07480.pdf
> > + */
> > +
> > +#include "avfilter.h"
> > +#include "formats.h"
> > +#include "internal.h"
> > +#include "libavutil/opt.h"
> > +#include "libavutil/qsort.h"
> > +#include "libavformat/avio.h"
> > +#include "libswscale/swscale.h"
> > +#include "dnn_interface.h"
> > +#include <math.h>
> > +
> > +typedef struct SDR2HDRContext {
> > +    const AVClass *class;
> > +
> > +    char* model_filename;
> > +    enum AVPixelFormat out_fmt;
> > +    DNNModule* dnn_module;
> > +    DNNModel* model;
> > +    DNNData input, output;
> > +} SDR2HDRContext;
> > +
> > +#define OFFSET(x) offsetof(SDR2HDRContext, x)
> > +#define FLAGS AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM
> > +static const AVOption sdr2hdr_options[] = {
> > +    { "model_filename", "path to model file specifying network architecture and its parameters", OFFSET(model_filename), AV_OPT_TYPE_STRING, {.str=NULL}, 0, 0, FLAGS },
> > +    { "out_fmt", "the data format of the filter's output, it could be gbrpf32le [default] or gbrp10le", OFFSET(out_fmt), AV_OPT_TYPE_PIXEL_FMT, {.i64=AV_PIX_FMT_GBRPF32LE}, AV_PIX_FMT_NONE, AV_PIX_FMT_NB, FLAGS },
> > +    { NULL }
> > +};
> > +
> > +AVFILTER_DEFINE_CLASS(sdr2hdr);
> > +
> > +static av_cold int init(AVFilterContext* context)
> > +{
> > +    SDR2HDRContext* ctx = context->priv;
> > +
> > +    if (ctx->out_fmt != AV_PIX_FMT_GBRPF32LE && ctx->out_fmt != AV_PIX_FMT_GBRP10LE) {
> > +        av_log(context, AV_LOG_ERROR, "unsupported output format\n");
> > +        return AVERROR(ENOSYS);
> > +    }
> > +
> > +    ctx->dnn_module = ff_get_dnn_module(DNN_TF);
> > +    if (!ctx->dnn_module){
> > +        av_log(context, AV_LOG_ERROR, "could not create DNN module for tensorflow backend\n");
> > +        return AVERROR(ENOMEM);
> > +    }
> > +    if (!ctx->model_filename){
> > +        av_log(context, AV_LOG_ERROR, "model file for network was not specified\n");
> > +        return AVERROR(EIO);
> > +    }
> > +    if (!ctx->dnn_module->load_model) {
> > +        av_log(context, AV_LOG_ERROR, "load_model for network was not specified\n");
> > +        return AVERROR(EIO);
> > +    }
> > +    ctx->model = (ctx->dnn_module->load_model)(ctx->model_filename);
> > +    if (!ctx->model){
> > +        av_log(context, AV_LOG_ERROR, "could not load DNN model\n");
> > +        return AVERROR(EIO);
> > +    }
> > +    return 0;
> > +}
> > +
> > +static int query_formats(AVFilterContext* context)
> > +{
> > +    const enum AVPixelFormat in_formats[] = {AV_PIX_FMT_RGB24, AV_PIX_FMT_NONE};
> > +    enum AVPixelFormat out_formats[2];
> > +    SDR2HDRContext* ctx = context->priv;
> > +    AVFilterFormats* formats_list;
> > +    int ret = 0;
> > +
> > +    formats_list = ff_make_format_list(in_formats);
> > +    if ((ret = ff_formats_ref(formats_list, &context->inputs[0]->out_formats)) < 0)
> > +        return ret;
> > +
> > +    out_formats[0] = ctx->out_fmt;
> > +    out_formats[1] = AV_PIX_FMT_NONE;
> > +    formats_list = ff_make_format_list(out_formats);
> > +    if ((ret = ff_formats_ref(formats_list, &context->outputs[0]->in_formats)) < 0)
> > +        return ret;
> > +
> > +    return 0;
> > +}
> > +
> > +static int config_props(AVFilterLink* inlink)
> > +{
> > +    AVFilterContext* context = inlink->dst;
> > +    SDR2HDRContext* ctx = context->priv;
> > +    AVFilterLink* outlink = context->outputs[0];
> > +    DNNReturnType result;
> > +
> > +    // the dnn model is tied to the resolution due to the deconv layer of
> > +    // tensorflow; for now only 1920x1080 is supported, hence the magic
> > +    // numbers within this file
> > +    if (inlink->w != 1920 || inlink->h != 1080) {
> > +        av_log(context, AV_LOG_ERROR, "only frame size 1920*1080 is supported\n");
> > +        return AVERROR(ENOSYS);
> > +    }
> > +
> > +    ctx->input.width = 1920;
> > +    ctx->input.height = 1088;   // the model requires a height that is a multiple of 32
> > +    ctx->input.channels = 3;
> > +
> > +    result = (ctx->model->set_input_output)(ctx->model->model, &ctx->input, &ctx->output);
> > +    if (result != DNN_SUCCESS){
> > +        av_log(context, AV_LOG_ERROR, "could not set input and output for the model\n");
> > +        return AVERROR(EIO);
> > +    }
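> > +
> > +    // The network consumes a 1920x1088 packed-RGB float buffer (1080 rows
> > +    // of real data plus 8 padding rows for the multiple-of-32 height);
> > +    // clear it once here so the padding rows stay zero for every frame.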
> > +    memset(ctx->input.data, 0, ctx->input.channels * ctx->input.width * ctx->input.height * sizeof(float));
> > +    outlink->h = 1080;
> > +    outlink->w = 1920;
> > +    return 0;
> > +}
> > +
> > +static float qsort_comparison_function_float(const void *a, const void *b)
> > +{
> > +    return *(const float *)a - *(const float *)b;
> > +}
> > +
> > +static int filter_frame(AVFilterLink* inlink, AVFrame* in)
> > +{
> > +    DNNReturnType dnn_result = DNN_SUCCESS;
> > +    AVFilterContext* context = inlink->dst;
> > +    SDR2HDRContext* ctx = context->priv;
> > +    AVFilterLink* outlink = context->outputs[0];
> > +    AVFrame* out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
> > +    int total_pixels = in->height * in->width;
> > +
> > +    if (!out){
> > +        av_log(context, AV_LOG_ERROR, "could not allocate memory for output frame\n");
> > +        av_frame_free(&in);
> > +        return AVERROR(ENOMEM);
> > +    }
> > +
> > +    av_frame_copy_props(out, in);
> > +
> > +    // normalize the packed RGB24 input to floats in [0, 1]; walk the rows
> > +    // explicitly so any linesize padding is not fed into the network
> > +    for (int j = 0; j < in->height; ++j) {
> > +        for (int k = 0; k < in->width * 3; ++k) {
> > +            ctx->input.data[j * in->width * 3 + k] = in->data[0][j * in->linesize[0] + k] / 255.0f;
> > +        }
> > +    }
> > +
> > +    dnn_result = (ctx->dnn_module->execute_model)(ctx->model);
> > +    if (dnn_result != DNN_SUCCESS){
> > +        av_log(context, AV_LOG_ERROR, "failed to execute loaded model\n");
> > +        av_frame_free(&in);
> > +        av_frame_free(&out);
> > +        return AVERROR(EIO);
> > +    }
> > +
> > +    if (ctx->out_fmt == AV_PIX_FMT_GBRPF32LE) {
> > +        // deinterleave the packed float RGB output into the G/B/R planes
> > +        float* outg = (float*)out->data[0];
> > +        float* outb = (float*)out->data[1];
> > +        float* outr = (float*)out->data[2];
> > +        for (int i = 0; i < total_pixels; ++i) {
> > +            float r = ctx->output.data[i*3];
> > +            float g = ctx->output.data[i*3+1];
> > +            float b = ctx->output.data[i*3+2];
> > +            outr[i] = r;
> > +            outg[i] = g;
> > +            outb[i] = b;
> > +        }
> > +    } else {
> > +        // here, we just use a rough mapping to the 10bit contents
> > +        // metadata generation for HDR video encoding is not supported yet
> > +        float* converted_data = (float*)av_malloc(total_pixels * 3 * sizeof(float));
> > +        int16_t* outg = (int16_t*)out->data[0];
> > +        int16_t* outb = (int16_t*)out->data[1];
> > +        int16_t* outr = (int16_t*)out->data[2];
> > +
> > +        if (!converted_data) {
> > +            av_frame_free(&in);
> > +            av_frame_free(&out);
> > +            return AVERROR(ENOMEM);
> > +        }
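> > +
> > +        // Rough tone mapping for the 10-bit preview path: take sqrt() of
> > +        // each linear value as a cheap transfer curve, pick the 99.5th
> > +        // percentile as the white point (so at most 0.5% of samples clip),
> > +        // then rescale everything into the 0..1023 range.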
> > +        float max = 1.0f;
> > +        for (int i = 0; i < total_pixels * 3; ++i) {
> > +            float d = ctx->output.data[i];
> > +            d = sqrt(d);
> > +            converted_data[i] = d;
> > +            max = FFMAX(d, max);
> > +        }
> > +
> > +        if (max > 1.0f) {
> > +            AV_QSORT(converted_data, total_pixels * 3, float, qsort_comparison_function_float);
> > +            // 0.5% of the pixels are clipped
> > +            max = converted_data[(int)(total_pixels * 3 * 0.995)];
> > +            max = FFMAX(max, 1.0f);
> > +
> > +            // the sort scrambled converted_data, so recompute the sqrt
> > +            // values and clip them against the chosen maximum
> > +            for (int i = 0; i < total_pixels * 3; ++i) {
> > +                float d = ctx->output.data[i];
> > +                d = sqrt(d);
> > +                d = FFMIN(d, max);
> > +                converted_data[i] = d;
> > +            }
> > +        }
> > +
> > +        for (int i = 0; i < total_pixels; ++i) {
> > +            float r = converted_data[i*3];
> > +            float g = converted_data[i*3+1];
> > +            float b = converted_data[i*3+2];
> > +            outr[i] = r / max * 1023;
> > +            outg[i] = g / max * 1023;
> > +            outb[i] = b / max * 1023;
> > +        }
> > +
> > +        av_free(converted_data);
> > +    }
> > +
> > +    av_frame_free(&in);
> > +    return ff_filter_frame(outlink, out);
> > +}
> > +
> > +static av_cold void uninit(AVFilterContext* context)
> > +{
> > +    SDR2HDRContext* ctx = context->priv;
> > +
> > +    if (ctx->dnn_module){
> > +        (ctx->dnn_module->free_model)(&ctx->model);
> > +        av_freep(&ctx->dnn_module);
> > +    }
> > +}
> > +
> > +static const AVFilterPad sdr2hdr_inputs[] = {
> > +    {
> > +        .name         = "default",
> > +        .type         = AVMEDIA_TYPE_VIDEO,
> > +        .config_props = config_props,
> > +        .filter_frame = filter_frame,
> > +    },
> > +    { NULL }
> > +};
> > +
> > +static const AVFilterPad sdr2hdr_outputs[] = {
> > +    {
> > +        .name = "default",
> > +        .type = AVMEDIA_TYPE_VIDEO,
> > +    },
> > +    { NULL }
> > +};
> > +
> > +AVFilter ff_vf_sdr2hdr = {
> > +    .name           = "sdr2hdr",
> > +    .description    = NULL_IF_CONFIG_SMALL("HDR image generation from a single exposure using deep CNNs."),
> > +    .priv_size      = sizeof(SDR2HDRContext),
> > +    .init           = init,
> > +    .uninit         = uninit,
> > +    .query_formats  = query_formats,
> > +    .inputs         = sdr2hdr_inputs,
> > +    .outputs        = sdr2hdr_outputs,
> > +    .priv_class     = &sdr2hdr_class,
> > +    .flags          = AVFILTER_FLAG_SUPPORT_TIMELINE_GENERIC,
> > +};
> > --
> > 2.7.4

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel