Re: [FFmpeg-devel] [PATCH 2/6] avcodec/nvdec: avoid needless copy of output frame

2018-05-10 Thread Oscar Amoros Huguet
Just an update: we compiled the master branch with these patches, tested it, and inspected the result in Nsight. Everything is correct: - The internal NVDEC kernel is using the stream we set in the AVCUDADeviceContext struct. - It slightly overlaps with other kernels, removing wait times between kernel

Re: [FFmpeg-devel] [PATCH 3/6] avutil/hwcontext_cuda: add CUstream in cuda hwctx

2018-05-08 Thread Oscar Amoros Huguet
Hi! Responding to the per-device question (sorry I can't make it shorter, the topic is quite dense). A typical CUDA application uses a single CUDA context, and multiple CUDA streams to allow asynchronicity between CUDA tasks (memory transfers, kernels, memsets) and make overlapping between those

Re: [FFmpeg-devel] [PATCH] Added the possibility to pass an externally created CUDA context to libavutil/hwcontext.c/av_hwdevice_ctx_create() for decoding with NVDEC

2018-05-08 Thread Oscar Amoros Huguet
ontext to libavutil/hwcontext.c/av_hwdevice_ctx_create() for decoding with NVDEC On 07.05.2018 at 19:37, Oscar Amoros Huguet wrote: > I was looking at the NVIDIA Video Codec SDK samples > (https://developer.nvidia.com/nvidia-video-codec-sdk#Download), where you can > find the header N

Re: [FFmpeg-devel] [PATCH] Added the possibility to pass an externally created CUDA context to libavutil/hwcontext.c/av_hwdevice_ctx_create() for decoding with NVDEC

2018-05-07 Thread Oscar Amoros Huguet
:25 PM To: ffmpeg-devel@ffmpeg.org Subject: Re: [FFmpeg-devel] [PATCH] Added the possibility to pass an externally created CUDA context to libavutil/hwcontext.c/av_hwdevice_ctx_create() for decoding with NVDEC On 26.04.2018 18:03, Oscar Amoros Huguet wrote: > Thanks Mark, > > You ar

Re: [FFmpeg-devel] [PATCH] Added the possibility to pass an externally created CUDA context to libavutil/hwcontext.c/av_hwdevice_ctx_create() for decoding with NVDEC

2018-05-07 Thread Oscar Amoros Huguet
Hi! Even if a synchronization is needed before leaving the ffmpeg call, calling cuMemcpyAsync will allow the copies to overlap with any other task on the GPU that was enqueued on any other non-blocking CUDA stream. That's exactly what we want to achieve. This would benefit

Re: [FFmpeg-devel] [PATCH] Added the possibility to pass an externally created CUDA context to libavutil/hwcontext.c/av_hwdevice_ctx_create() for decoding with NVDEC

2018-05-07 Thread Oscar Amoros Huguet
rg> wrote: > > On 07.05.2018 at 18:25, Oscar Amoros Huguet wrote: >> Have a look at this, looks pretty interesting: >> /** >> * @brief This function decodes a frame and returns the locked frame >> buffers >> * This makes the buff

Re: [FFmpeg-devel] [PATCH] Added the possibility to pass an externally created CUDA context to libavutil/hwcontext.c/av_hwdevice_ctx_create() for decoding with NVDEC

2018-05-07 Thread Oscar Amoros Huguet
Message- From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of Oscar Amoros Huguet Sent: Monday, May 7, 2018 6:21 PM To: FFmpeg development discussions and patches <ffmpeg-devel@ffmpeg.org> Subject: Re: [FFmpeg-devel] [PATCH] Added the possibility to pass an externally

Re: [FFmpeg-devel] [PATCH] Added the possibility to pass an externally created CUDA context to libavutil/hwcontext.c/av_hwdevice_ctx_create() for decoding with NVDEC

2018-05-07 Thread Oscar Amoros Huguet
fmpeg.org> On Behalf Of Timo Rothenpieler Sent: Monday, May 7, 2018 5:13 PM To: ffmpeg-devel@ffmpeg.org Subject: Re: [FFmpeg-devel] [PATCH] Added the possibility to pass an externally created CUDA context to libavutil/hwcontext.c/av_hwdevice_ctx_create() for decoding with NVDEC On 07.05.2018 at 17

Re: [FFmpeg-devel] [PATCH] Added the possibility to pass an externally created CUDA context to libavutil/hwcontext.c/av_hwdevice_ctx_create() for decoding with NVDEC

2018-05-07 Thread Oscar Amoros Huguet
ication, they can set their own CUDA context and their own non-default stream. In any case, ffmpeg does not have to handle CUDA stream creation and destruction, which makes it simpler. Hope you like it! Oscar -Original Message- From: Oscar Amoros Huguet Sent: Monday, May 7, 201

Re: [FFmpeg-devel] [PATCH] Added the possibility to pass an externally created CUDA context to libavutil/hwcontext.c/av_hwdevice_ctx_create() for decoding with NVDEC

2018-04-26 Thread Oscar Amoros Huguet
Thanks Mark, You are right, we can implement a sort of "av_hwdevice_ctx_set" (which does not exist) in our own code by using av_hwdevice_ctx_alloc() + av_hwdevice_ctx_init(). We actually call av_hwdevice_ctx_alloc() in our code already, to use the feature we implemented. We are not sure about license

[FFmpeg-devel] [PATCH] Added the possibility to pass an externally created CUDA context to libavutil/hwcontext.c/av_hwdevice_ctx_create() for decoding with NVDEC

2018-04-20 Thread Oscar Amoros Huguet
Hi! We changed four files in ffmpeg: libavcodec/nvdec.c, libavutil/hwcontext.c, libavutil/hwcontext_cuda.h, and libavutil/hwcontext_cuda.c. The purpose of this modification is very simple. We needed, for performance reasons (per-frame execution time), nvdec.c to use the same CUDA context as we