On Thu, Nov 7, 2019 at 13:17, Guo, Yejun <yejun....@intel.com> wrote:
> > From: Pedro Arthur [mailto:bygran...@gmail.com]
> > Sent: Thursday, November 07, 2019 1:18 AM
> > To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org>
> > Cc: Guo, Yejun <yejun....@intel.com>
> > Subject: Re: [FFmpeg-devel] [PATCH V3] avfilter/vf_dnn_processing: add a generic filter for image proccessing with dnn networks
> >
> > Hi,
> >
> > On Thu, Oct 31, 2019 at 05:39, Guo, Yejun <yejun....@intel.com> wrote:
> > > This filter accepts all the dnn networks which do image processing.
> > > Currently, frames in the rgb24 and bgr24 formats are supported. Other
> > > formats such as gray and YUV will be supported next. The dnn network
> > > can accept data in float32 or uint8 format, and the dnn network can
> > > change the frame size.
> > >
> > > The following is a python script that halves the value of the first
> > > channel of each pixel. It demonstrates how to set up and execute a
> > > dnn model with python+tensorflow. It also generates the .pb file
> > > which will be used by ffmpeg.
> > > import tensorflow as tf
> > > import numpy as np
> > > import scipy.misc
> > > in_img = scipy.misc.imread('in.bmp')
> > > in_img = in_img.astype(np.float32)/255.0
> > > in_data = in_img[np.newaxis, :]
> > > filter_data = np.array([0.5, 0, 0, 0, 1., 0, 0, 0, 1.]).reshape(1,1,3,3).astype(np.float32)
> > > filter = tf.Variable(filter_data)
> > > x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
> > > y = tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='VALID', name='dnn_out')
> > > sess=tf.Session()
> > > sess.run(tf.global_variables_initializer())
> > > output = sess.run(y, feed_dict={x: in_data})
> > > graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
> > > tf.train.write_graph(graph_def, '.', 'halve_first_channel.pb', as_text=False)
> > > output = output * 255.0
> > > output = output.astype(np.uint8)
> > > scipy.misc.imsave("out.bmp", np.squeeze(output))
> > >
> > > To do the same thing with ffmpeg:
> > > - generate halve_first_channel.pb with the above script
> > > - generate halve_first_channel.model with tools/python/convert.py
> > > - try with the following commands
> > > ./ffmpeg -i input.jpg -vf dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=native -y out.native.png
> > > ./ffmpeg -i input.jpg -vf dnn_processing=model=halve_first_channel.pb:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=tensorflow -y out.tf.png
> >
> > It would be great if you could transform the above steps into a fate test;
> > that way one can automatically ensure the filter is always working
> > properly.
>
> sure, I'll add a fate test to test this filter with
> halve_first_channel.model. There will be no test for the tensorflow part
> since fate tests require no external dependency.
> furthermore, more industry-famous models can be added into this fate test
> after we support them by adding more layers into native mode, and after we
> optimize the conv2d layer, which is now very very very very slow.
>
> > > +};
> > > +
> > > +AVFilter ff_vf_dnn_processing = {
> > > +    .name          = "dnn_processing",
> > > +    .description   = NULL_IF_CONFIG_SMALL("Apply DNN processing filter to the input."),
> > > +    .priv_size     = sizeof(DnnProcessingContext),
> > > +    .init          = init,
> > > +    .uninit        = uninit,
> > > +    .query_formats = query_formats,
> > > +    .inputs        = dnn_processing_inputs,
> > > +    .outputs       = dnn_processing_outputs,
> > > +    .priv_class    = &dnn_processing_class,
> > > +};
> > > --
> > > 2.7.4
> >
> > rest LGTM.
>
> thanks, could we first push this patch?

patch pushed, thanks. I slightly edited the commit message, changing
"scipy.misc" to "imageio", as the former is deprecated and not present in
newer versions.

> I plan to add two more changes for this filter next:
> - add gray8 and gray32 support
> - add y_from_yuv support; in other words, the network only handles the Y
> channel, and the uv parts are not changed (or just scaled), just like what
> vf_sr does.
>
> I currently have no plan to add specific yuv formats, since I do not see a
> famous network which handles all the y, u, v channels.
>
> > BTW, do you already have concrete use cases (or plans) for this filter?
>
> not yet; the idea of this filter is that it is general for image
> processing and should be very useful, and my basic target is to at least
> cover the features provided by vf_sr and vf_derain.
>
> actually, I do have a use case plan for a general video analytic filter;
> the side data type might be a big challenge, and I'm still thinking about
> it. I chose this image processing filter first because it is simpler, and
> the community can become familiar with dnn based filters step by step.
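For reference, the 1x1 conv2d in the quoted script is just a per-pixel
channel-mixing matrix, so the same halve-first-channel operation can be
sketched in plain numpy without tensorflow. The `halve_first_channel`
helper name below is illustrative, and imageio is suggested for the I/O
only because the pushed commit message swapped it in for the deprecated
scipy.misc:

```python
import numpy as np

# The script's filter_data, reshaped to (1, 1, 3, 3), makes conv2d compute
# out[h, w, j] = sum_i in[h, w, i] * K[i, j] at every pixel.
# K halves channel 0 and passes channels 1 and 2 through unchanged.
K = np.array([0.5, 0.0, 0.0,
              0.0, 1.0, 0.0,
              0.0, 0.0, 1.0], dtype=np.float32).reshape(3, 3)

def halve_first_channel(img):
    """img: HxWx3 float32 in [0, 1]; returns the channel-mixed image."""
    return img @ K  # matmul over the channel axis, same as the 1x1 conv

# The surrounding I/O, with imageio in place of scipy.misc, would look like:
#   import imageio
#   in_img = imageio.imread('in.bmp').astype(np.float32) / 255.0
#   out = (halve_first_channel(in_img) * 255.0).astype(np.uint8)
#   imageio.imwrite('out.bmp', out)

demo = np.ones((2, 2, 3), dtype=np.float32)
out = halve_first_channel(demo)
print(out[0, 0])  # channel 0 halved to 0.5, channels 1 and 2 still 1.0
```

This is only a sanity-check sketch of the model's math; the actual test of
the filter would still go through the generated .pb/.model files and the
ffmpeg command lines quoted above.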
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".