Re: [FFmpeg-devel] [PATCH 1/2] dnn/native: add native support for avg_pool
> -----Original Message-----
> From: ffmpeg-devel On Behalf Of Guo, Yejun
> Sent: Monday, July 20, 2020 01:46 PM
> To: FFmpeg development discussions and patches
> Subject: Re: [FFmpeg-devel] [PATCH 1/2] dnn/native: add native support for avg_pool
>
> > -----Original Message-----
> > From: ffmpeg-devel On Behalf Of Ting Fu
> > Sent: July 17, 2020 23:23
> > To: ffmpeg-devel@ffmpeg.org
> > Subject: [FFmpeg-devel] [PATCH 1/2] dnn/native: add native support for avg_pool
> >
> > It can be tested with the model generated with below python script:
> >
> > import tensorflow as tf
> > import numpy as np
> > import imageio
> >
> > in_img = imageio.imread('input_odd.jpg')
> > in_img = in_img.astype(np.float32)/255.0
> > in_data = in_img[np.newaxis, :]
> >
> > x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
> > x_pool = tf.nn.avg_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME') #please alter the params as needed
> > y = tf.identity(x_pool, name='dnn_out')
> >
> > sess=tf.Session()
> > sess.run(tf.global_variables_initializer())
> >
> > graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
> > tf.train.write_graph(graph_def, '.', 'image_process.pb', as_text=False)
> >
> > print("image_process.pb generated, please use \
> > path_to_ffmpeg/tools/python/convert.py to generate image_process.model\n")
> >
> > output = sess.run(y, feed_dict={x: in_data})
> > imageio.imsave("out.jpg", np.squeeze(output))
> >
> > Signed-off-by: Ting Fu
> > ---
> >  libavfilter/dnn/Makefile                      |   1 +
> >  libavfilter/dnn/dnn_backend_native.h          |   2 +
> >  .../dnn/dnn_backend_native_layer_avgpool.c    | 136 ++
> >  .../dnn/dnn_backend_native_layer_avgpool.h    |  35 +
> >  .../dnn/dnn_backend_native_layer_conv2d.h     |   3 +-
> >  libavfilter/dnn/dnn_backend_native_layers.c   |   2 +
> >  tools/python/convert_from_tensorflow.py       |  31 +++-
> >  7 files changed, 207 insertions(+), 3 deletions(-)
> >  create mode 100644 libavfilter/dnn/dnn_backend_native_layer_avgpool.c
> >  create mode 100644 libavfilter/dnn/dnn_backend_native_layer_avgpool.h
> >
> > [...]
> > +    int32_t input_operand_index = input_operand_indexes[0];
> > +    int number = operands[input_operand_index].dims[0];
> > +    int height = operands[input_operand_index].dims[1];
> > +    int width = operands[input_operand_index].dims[2];
> > +    int channel = operands[input_operand_index].dims[3];
>
> the input channel should come from here, not in AvgPoolParams.
> And so as output channel.

Hi Yejun,
Understood, in_channel should come from here. Does 'And so as output channel' mean that out_channel = in_channel here (since pooling across channels is not supported)?

>
> > +    const float *input = operands[input_operand_index].data;
> > +    const AvgPoolParams *avgpool_params = (const AvgPoolParams *)parameters;
> > +
> > +    float kernel_strides = avgpool_params->strides;
>
> why float?

So that height/kernel_strides gives a float result for the ceil() call that follows. Or should I multiply kernel_strides by 1.0 when calling ceil()?

>
> > +    int src_linesize = width * avgpool_params->in_channels;
> > +    DnnOperand *output_operand = &operands[output_operand_index];
> > +
> > +    if (avgpool_params->padding_method == SAME) {
> > +        height_end = height;
> > +        width_end = width;
> > +        height_radius = (avgpool_params->kernel_size - ((height - 1) % (int)kernel_strides + 1));
>
> don't need the first '(' and last ')'.

OK

>
> why we need to consider kernel_strides here?

Because when padding_method=SAME, TensorFlow only adds the zero pixels that the kernel still needs beyond the remainder pixels, with about half of them before the image and the rest after it.
For example, if the width is 1080 and strides=11, then 1080%11=2.
With ksize=5, it fills (5-2)>>1=1 column before the image and 2 columns after the image.
With ksize=2, 2-2=0: the remainder pixels are exactly enough for one more pooling step, so no zero pixels are filled.
In other words, the number of zero pixels filled depends on the number of remainder pixels.
Does the example make sense?

>
> > +        width_radius = (avgpool_params->kernel_size - ((width - 1) % (int)kernel_strides + 1));
>
> same as above.
>
> > +        height_radius = height_radius < 0 ? 0 : height_radi
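
The remainder-based padding described in the reply above can be checked against the usual formulation of TensorFlow's SAME padding. What follows is a minimal sketch, not code from the patch: `same_padding` and `patch_total_padding` are hypothetical helper names introduced here for illustration. It also shows one possible answer to the "why float?" question, namely an integer ceiling division that would let the stride stay an int.

```python
def same_padding(in_size, kernel_size, stride):
    """SAME padding along one spatial dimension, TensorFlow-style.

    Hypothetical helper for illustration only.
    Returns (out_size, pad_before, pad_after).
    """
    # ceil(in_size / stride) using integers only, so no float stride is needed
    out_size = (in_size + stride - 1) // stride
    # zero rows/columns needed so the last window fits; never negative
    total = max((out_size - 1) * stride + kernel_size - in_size, 0)
    pad_before = total // 2          # same as total >> 1
    pad_after = total - pad_before   # the extra one, if any, goes after the data
    return out_size, pad_before, pad_after


def patch_total_padding(in_size, kernel_size, stride):
    """Total padding as given by the expression quoted from the patch,
    including the clamp to zero (hypothetical wrapper)."""
    return max(kernel_size - ((in_size - 1) % stride + 1), 0)


if __name__ == "__main__":
    # The example from the reply: width 1080, strides 11, ksize 5 and 2.
    for ksize in (5, 2):
        print(ksize, same_padding(1080, ksize, 11),
              patch_total_padding(1080, ksize, 11))
```

For width 1080 and strides 11 this gives 1 column before and 2 after with ksize=5, and no padding with ksize=2, matching the numbers in the reply. The two expressions agree because `(in_size - 1) % stride + 1` is the number of input pixels left for the last window.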
[FFmpeg-devel] [PATCH 1/2] dnn/native: add native support for avg_pool
It can be tested with the model generated with below python script:

import tensorflow as tf
import numpy as np
import imageio

in_img = imageio.imread('input_odd.jpg')
in_img = in_img.astype(np.float32)/255.0
in_data = in_img[np.newaxis, :]

x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
x_pool = tf.nn.avg_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME') #please alter the params as needed
y = tf.identity(x_pool, name='dnn_out')

sess=tf.Session()
sess.run(tf.global_variables_initializer())

graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
tf.train.write_graph(graph_def, '.', 'image_process.pb', as_text=False)

print("image_process.pb generated, please use \
path_to_ffmpeg/tools/python/convert.py to generate image_process.model\n")

output = sess.run(y, feed_dict={x: in_data})
imageio.imsave("out.jpg", np.squeeze(output))

Signed-off-by: Ting Fu
---
 libavfilter/dnn/Makefile                      |   1 +
 libavfilter/dnn/dnn_backend_native.h          |   2 +
 .../dnn/dnn_backend_native_layer_avgpool.c    | 136 ++
 .../dnn/dnn_backend_native_layer_avgpool.h    |  35 +
 .../dnn/dnn_backend_native_layer_conv2d.h     |   3 +-
 libavfilter/dnn/dnn_backend_native_layers.c   |   2 +
 tools/python/convert_from_tensorflow.py       |  31 +++-
 7 files changed, 207 insertions(+), 3 deletions(-)
 create mode 100644 libavfilter/dnn/dnn_backend_native_layer_avgpool.c
 create mode 100644 libavfilter/dnn/dnn_backend_native_layer_avgpool.h

diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile
index d90137ec42..e0957073ee 100644
--- a/libavfilter/dnn/Makefile
+++ b/libavfilter/dnn/Makefile
@@ -1,6 +1,7 @@
 OBJS-$(CONFIG_DNN) += dnn/dnn_interface.o
 OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native.o
 OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native_layers.o
+OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native_layer_avgpool.o
 OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native_layer_pad.o
 OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native_layer_conv2d.o
 OBJS-$(CONFIG_DNN) += dnn/dnn_backend_native_layer_depth2space.o
diff --git a/libavfilter/dnn/dnn_backend_native.h b/libavfilter/dnn/dnn_backend_native.h
index 62191ffe88..26e9a33387 100644
--- a/libavfilter/dnn/dnn_backend_native.h
+++ b/libavfilter/dnn/dnn_backend_native.h
@@ -43,10 +43,12 @@ typedef enum {
     DLT_MAXIMUM = 4,
     DLT_MATH_BINARY = 5,
     DLT_MATH_UNARY = 6,
+    DLT_AVG_POOL = 7,
     DLT_COUNT
 } DNNLayerType;
 
 typedef enum {DOT_INPUT = 1, DOT_OUTPUT = 2, DOT_INTERMEDIATE = DOT_INPUT | DOT_OUTPUT} DNNOperandType;
+typedef enum {VALID, SAME, SAME_CLAMP_TO_EDGE} DNNPaddingParam;
 
 typedef struct Layer{
     DNNLayerType type;
diff --git a/libavfilter/dnn/dnn_backend_native_layer_avgpool.c b/libavfilter/dnn/dnn_backend_native_layer_avgpool.c
new file mode 100644
index 00..f5a3f4a0dc
--- /dev/null
+++ b/libavfilter/dnn/dnn_backend_native_layer_avgpool.c
@@ -0,0 +1,136 @@
+/*
+ * Copyright (c) 2020
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+/**
+ * @file
+ * DNN native backend implementation.
+ */
+
+#include "libavutil/avassert.h"
+#include "dnn_backend_native_layer_avgpool.h"
+
+int dnn_load_layer_avg_pool(Layer *layer, AVIOContext *model_file_context, int file_size, int operands_num)
+{
+    AvgPoolParams *avgpool_params;
+    int dnn_size = 0;
+    avgpool_params = av_malloc(sizeof(*avgpool_params));
+    if(!avgpool_params)
+        return 0;
+
+    avgpool_params->strides = (int32_t)avio_rl32(model_file_context);
+    avgpool_params->padding_method = (int32_t)avio_rl32(model_file_context);
+    avgpool_params->in_channels = (int32_t)avio_rl32(model_file_context);
+    avgpool_params->out_channels = (int32_t)avio_rl32(model_file_context);
+    avgpool_params->kernel_size = (int32_t)avio_rl32(model_file_context);
+    dnn_size += 20;
+
+    if (dnn_size > file_size || avgpool_params->in_channels <= 0 ||
+        avgpool_params->out_channels <= 0 || avgpool_params->kernel_size <= 0 ||
+        avgpool_params->strides <=0){
+