Re: [FFmpeg-devel] [PATCH v2] libavfi/dnn: add LibTorch as one of DNN backend

2024-02-19 Thread Chen, Wenbin
> > Hello,
> >
> > On Fri, 2 Feb 2024, at 08:26, wenbin.chen-at-intel@ffmpeg.org wrote:
> > > +static void infer_completion_callback(void *args) {
> > > +THRequestItem *request = (THRequestItem*)args;
> > > +LastLevelTaskItem *lltask = request->lltask;
> > > +TaskItem *task = lltask->task;
> > > +DNNData outputs = { 0 };
> > > +THInferRequest *infer_request = request->infer_request;
> > > +THModel *th_model = (THModel *)task->model;
> > > +torch::Tensor *output = infer_request->output;
> > > +
> > > +c10::IntArrayRef sizes = output->sizes();
> > > +assert(sizes.size == 5);
> >
> > Why 5?
> 
> 5 means 5 channels: [batch_size, frame_number, channel, height, width]

Sorry, I mean 5 dimensions.

> I have only added video SR support, so only this type of data is supported
> for now. I will change the code to make it easier to read.
> 
> >
> > > +outputs.order = DCO_RGB;
> > > +outputs.layout = DL_NCHW;
> > > +outputs.dims[2] = sizes.at(3);
> > > +outputs.dims[3] = sizes.at(4);
> > > +outputs.dt = DNN_FLOAT;
> > > +outputs.dims[1] = 3;
> >
> > Why 3?
> 
> It is RGB, so the channel count is 3, but I should use sizes.at(2) instead of
> a magic number.
> Thanks for pointing it out. I will update it in patch v3.
> 
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH v2] libavfi/dnn: add LibTorch as one of DNN backend

2024-02-19 Thread Chen, Wenbin
> Hello,
> 
> On Fri, 2 Feb 2024, at 08:26, wenbin.chen-at-intel@ffmpeg.org wrote:
> > +static void infer_completion_callback(void *args) {
> > +THRequestItem *request = (THRequestItem*)args;
> > +LastLevelTaskItem *lltask = request->lltask;
> > +TaskItem *task = lltask->task;
> > +DNNData outputs = { 0 };
> > +THInferRequest *infer_request = request->infer_request;
> > +THModel *th_model = (THModel *)task->model;
> > +torch::Tensor *output = infer_request->output;
> > +
> > +c10::IntArrayRef sizes = output->sizes();
> > +assert(sizes.size == 5);
> 
> Why 5?

5 means 5 channels: [batch_size, frame_number, channel, height, width]
I have only added video SR support, so only this type of data is supported
for now. I will change the code to make it easier to read.

> 
> > +outputs.order = DCO_RGB;
> > +outputs.layout = DL_NCHW;
> > +outputs.dims[2] = sizes.at(3);
> > +outputs.dims[3] = sizes.at(4);
> > +outputs.dt = DNN_FLOAT;
> > +outputs.dims[1] = 3;
> 
> Why 3?

It is RGB, so the channel count is 3, but I should use sizes.at(2) instead of
a magic number.
Thanks for pointing it out. I will update it in patch v3.
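
For illustration, the two points above (checking for five dimensions and
reading the channel count from the shape instead of hard-coding 3) could be
sketched roughly as follows. This is only a sketch under stated assumptions,
not the v3 code: std::vector<int64_t> stands in for the c10::IntArrayRef
returned by output->sizes() so that it builds without LibTorch, and
OutputDims, dims_from_sizes and the SR_* constants are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Named indices for the 5-D layout [batch_size, frame_number, channel, height, width].
constexpr std::size_t SR_CHANNEL = 2;
constexpr std::size_t SR_HEIGHT  = 3;
constexpr std::size_t SR_WIDTH   = 4;
constexpr std::size_t SR_NB_DIMS = 5;

// Hypothetical stand-in for the DNNData fields that the callback fills in.
struct OutputDims {
    int64_t channels;
    int64_t height;
    int64_t width;
};

// `sizes` plays the role of output->sizes(); a std::vector is an assumption
// made here so that the sketch does not depend on LibTorch.
OutputDims dims_from_sizes(const std::vector<int64_t> &sizes)
{
    // Reject anything that is not a 5-D video SR tensor before indexing into it.
    if (sizes.size() != SR_NB_DIMS)
        throw std::invalid_argument("expected a 5-D tensor [N, T, C, H, W]");

    // The channel count comes from the shape itself, not from a magic number.
    return OutputDims{ sizes.at(SR_CHANNEL), sizes.at(SR_HEIGHT), sizes.at(SR_WIDTH) };
}
```

With a shape of [1, 1, 3, 720, 1280] this yields 3 channels, a height of 720
and a width of 1280; a 4-D shape is rejected instead of being read out of range.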



Re: [FFmpeg-devel] [PATCH v2] libavfi/dnn: add LibTorch as one of DNN backend

2024-02-13 Thread Jean-Baptiste Kempf
Hello,

On Fri, 2 Feb 2024, at 08:26, wenbin.chen-at-intel@ffmpeg.org wrote:
> +static void infer_completion_callback(void *args) {
> +THRequestItem *request = (THRequestItem*)args;
> +LastLevelTaskItem *lltask = request->lltask;
> +TaskItem *task = lltask->task;
> +DNNData outputs = { 0 };
> +THInferRequest *infer_request = request->infer_request;
> +THModel *th_model = (THModel *)task->model;
> +torch::Tensor *output = infer_request->output;
> +
> +c10::IntArrayRef sizes = output->sizes();
> +assert(sizes.size == 5);

Why 5?

> +outputs.order = DCO_RGB;
> +outputs.layout = DL_NCHW;
> +outputs.dims[2] = sizes.at(3);
> +outputs.dims[3] = sizes.at(4);
> +outputs.dt = DNN_FLOAT;
> +outputs.dims[1] = 3;

Why 3?


-- 
Jean-Baptiste Kempf -  President
+33 672 704 734


Re: [FFmpeg-devel] [PATCH v2] libavfi/dnn: add LibTorch as one of DNN backend

2024-02-09 Thread Cosmin Stejerean via ffmpeg-devel


> On Feb 1, 2024, at 11:26 PM, wenbin.chen-at-intel@ffmpeg.org wrote:
> 
> From: Wenbin Chen 
> 
> PyTorch is an open source machine learning framework that accelerates
> the path from research prototyping to production deployment. Official
> website: https://pytorch.org/. The C++ library of PyTorch is referred to
> as LibTorch below.
> 
> To build FFmpeg with LibTorch, take the following steps as a reference:
> 1. Download the LibTorch C++ library from https://pytorch.org/get-started/locally/;
>    select C++/Java for the language, and the other options as needed.
> 2. Unzip the file to your own directory with the command
>    unzip libtorch-shared-with-deps-latest.zip -d your_dir
> 3. Export libtorch_root/libtorch/include and
>    libtorch_root/libtorch/include/torch/csrc/api/include to $PATH, and
>    export libtorch_root/libtorch/lib/ to $LD_LIBRARY_PATH.
> 4. Configure FFmpeg with
>    ../configure --enable-libtorch \
>        --extra-cflags=-I/libtorch_root/libtorch/include \
>        --extra-cflags=-I/libtorch_root/libtorch/include/torch/csrc/api/include \
>        --extra-ldflags=-L/libtorch_root/libtorch/lib/
> 5. make
> 
> To run FFmpeg DNN inference with the LibTorch backend:
> ./ffmpeg -i input.jpg -vf dnn_processing=dnn_backend=torch:model=LibTorch_model.pt -y output.jpg
> The LibTorch_model.pt file can be generated in Python with the torch.jit.script() API.
> Please note that torch.jit.trace() is not recommended, since it does not support
> ambiguous input sizes.
> 
> Signed-off-by: Ting Fu 
> Signed-off-by: Wenbin Chen 

Is there any feedback on this patch? It would be great to get this in before 
7.0. 

- Cosmin




[FFmpeg-devel] [PATCH v2] libavfi/dnn: add LibTorch as one of DNN backend

2024-02-01 Thread wenbin . chen-at-intel . com
From: Wenbin Chen 

PyTorch is an open source machine learning framework that accelerates
the path from research prototyping to production deployment. Official
website: https://pytorch.org/. The C++ library of PyTorch is referred to
as LibTorch below.

To build FFmpeg with LibTorch, take the following steps as a reference:
1. Download the LibTorch C++ library from https://pytorch.org/get-started/locally/;
   select C++/Java for the language, and the other options as needed.
2. Unzip the file to your own directory with the command
   unzip libtorch-shared-with-deps-latest.zip -d your_dir
3. Export libtorch_root/libtorch/include and
   libtorch_root/libtorch/include/torch/csrc/api/include to $PATH, and
   export libtorch_root/libtorch/lib/ to $LD_LIBRARY_PATH.
4. Configure FFmpeg with
   ../configure --enable-libtorch \
       --extra-cflags=-I/libtorch_root/libtorch/include \
       --extra-cflags=-I/libtorch_root/libtorch/include/torch/csrc/api/include \
       --extra-ldflags=-L/libtorch_root/libtorch/lib/
5. make

To run FFmpeg DNN inference with the LibTorch backend:
./ffmpeg -i input.jpg -vf dnn_processing=dnn_backend=torch:model=LibTorch_model.pt -y output.jpg
The LibTorch_model.pt file can be generated in Python with the torch.jit.script() API.
Please note that torch.jit.trace() is not recommended, since it does not support
ambiguous input sizes.

Signed-off-by: Ting Fu 
Signed-off-by: Wenbin Chen 
---
 configure                             |   5 +-
 libavfilter/dnn/Makefile              |   1 +
 libavfilter/dnn/dnn_backend_torch.cpp | 587 ++
 libavfilter/dnn/dnn_interface.c       |   5 +
 libavfilter/dnn_filter_common.c       |  15 +-
 libavfilter/dnn_interface.h           |   2 +-
 libavfilter/vf_dnn_processing.c       |   3 +
 7 files changed, 614 insertions(+), 4 deletions(-)
 create mode 100644 libavfilter/dnn/dnn_backend_torch.cpp

diff --git a/configure b/configure
index 68f675a4bc..bc11172fe4 100755
--- a/configure
+++ b/configure
@@ -279,6 +279,7 @@ External library support:
   --enable-libtheora       enable Theora encoding via libtheora [no]
   --enable-libtls          enable LibreSSL (via libtls), needed for https support
                            if openssl, gnutls or mbedtls is not used [no]
+  --enable-libtorch        enable Torch as one DNN backend [no]
   --enable-libtwolame      enable MP2 encoding via libtwolame [no]
   --enable-libuavs3d       enable AVS3 decoding via libuavs3d [no]
   --enable-libv4l2         enable libv4l2/v4l-utils [no]
@@ -1901,6 +1902,7 @@ EXTERNAL_LIBRARY_LIST="
 libtensorflow
 libtesseract
 libtheora
+libtorch
 libtwolame
 libuavs3d
 libv4l2
@@ -2776,7 +2778,7 @@ cbs_vp9_select="cbs"
 deflate_wrapper_deps="zlib"
 dirac_parse_select="golomb"
 dovi_rpu_select="golomb"
-dnn_suggest="libtensorflow libopenvino"
+dnn_suggest="libtensorflow libopenvino libtorch"
 dnn_deps="avformat swscale"
 error_resilience_select="me_cmp"
 evcparse_select="golomb"
@@ -6873,6 +6875,7 @@ enabled libtensorflow && require libtensorflow tensorflow/c/c_api.h TF_Versi
 enabled libtesseract      && require_pkg_config libtesseract tesseract tesseract/capi.h TessBaseAPICreate
 enabled libtheora         && require libtheora theora/theoraenc.h th_info_init -ltheoraenc -ltheoradec -logg
 enabled libtls            && require_pkg_config libtls libtls tls.h tls_configure
+enabled libtorch          && check_cxxflags -std=c++14 && require_cpp libtorch torch/torch.h "torch::Tensor" -ltorch -lc10 -ltorch_cpu -lstdc++ -lpthread
 enabled libtwolame        && require libtwolame twolame.h twolame_init -ltwolame &&
                              { check_lib libtwolame twolame.h twolame_encode_buffer_float32_interleaved -ltwolame ||
                                die "ERROR: libtwolame must be installed and version must be >= 0.3.10"; }
diff --git a/libavfilter/dnn/Makefile b/libavfilter/dnn/Makefile
index 5d5697ea42..3d09927c98 100644
--- a/libavfilter/dnn/Makefile
+++ b/libavfilter/dnn/Makefile
@@ -6,5 +6,6 @@ OBJS-$(CONFIG_DNN)                           += dnn/dnn_backend_common.o
 
 DNN-OBJS-$(CONFIG_LIBTENSORFLOW)             += dnn/dnn_backend_tf.o
 DNN-OBJS-$(CONFIG_LIBOPENVINO)               += dnn/dnn_backend_openvino.o
+DNN-OBJS-$(CONFIG_LIBTORCH)                  += dnn/dnn_backend_torch.o
 
 OBJS-$(CONFIG_DNN)                           += $(DNN-OBJS-yes)
diff --git a/libavfilter/dnn/dnn_backend_torch.cpp b/libavfilter/dnn/dnn_backend_torch.cpp
new file mode 100644
index 0000000000..b905c55175
--- /dev/null
+++ b/libavfilter/dnn/dnn_backend_torch.cpp
@@ -0,0 +1,587 @@
+/*
+ * Copyright (c) 2024
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even