Hi Rob, Can you share the test cases and environment set up details as soon as possible where you observed the issue that could be helpful to verify your patches ?
On Wed, 8 Jun, 2022, 22:14 Rob Arrow, <rob.ar...@v-nova.com> wrote: > Hi, > > Sorry if you've already received this, I originally submitted the > following contribution to x265contributi...@multicorewareinc.com, but I > received no response so am trying again here. > > Thanks, > Rob > > ~~~ > > I have identified a deadlock which occurs occasionally with x265. > Unfortunately, the deadlock is difficult to reproduce reliably so I am > unable to provide steps to reproduce. I have included a patch in the body > below to resolve the deadlock and a description of the conditions leading > to the deadlock is included the commit message for the patch. > > During the investigation of this bug, it was also revealed that there were > unprotected mutations of `CTURow.active` and `CTURow.busy`. The proposed > patch adds protection to these mutations by taking a `ScopedLock` before > accessing them. > > <http://www.v-nova.com/> > <https://www.v-nova.com/perseus-video-compression-technology/#whatis> > > > *Rob Arrow*Software Engineer > > 20 Eastbourne Terrace > London W2 6LG > www.v-nova.com > > <https://www.linkedin.com/company/2760764/admin/> > <https://twitter.com/VNovaVideo> > > This email and any attachments are sent in strictest confidence for the > sole use of the addressee and may contain legally privileged, confidential, > proprietary and personal data. If you are not the intended recipient, > please advise the sender by replying promptly to this email and then delete > and destroy this email and any attachments without any further use, copying > or forwarding. > > For more information on how we use and protect your information, please > review our Privacy Policy here <https://www.v-nova.com/privacy-policy/> > > Please consider your environmental responsibility before printing this > e-mail > From: Rob Arrow <rob.ar...@v-nova.com> > Date: Thu, 16 Dec 2021 13:30:20 +0000 > Subject: [PATCH 1/2] Protect access to CTURow.active CTURow.busy > > The comments in frameencoder.h state that the CTURow.lock "must be > acquired when reading or writing m_active or m_busy". "Previously > there were two points in the code where CTURow.active and CTURow.busy > were being modified without being protected by the CTURow.lock mutex. > This commit protects the modifications by taking the lock. > --- > source/encoder/frameencoder.cpp | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/source/encoder/frameencoder.cpp > b/source/encoder/frameencoder.cpp > index 1b3875a25..e0c3c55b8 100644 > --- a/source/encoder/frameencoder.cpp > +++ b/source/encoder/frameencoder.cpp > @@ -818,8 +818,12 @@ void FrameEncoder::compressFrame() > * compressed in a wave-front pattern if WPP is enabled. Row based > loop > * filters runs behind the CTU compression and reconstruction */ > > - for (uint32_t sliceId = 0; sliceId < m_param->maxSlices; sliceId++) > > - m_rows[m_sliceBaseRow[sliceId]].active = true; > + for (uint32_t sliceId = 0; sliceId < m_param->maxSlices; sliceId++) > + { > + CTURow& curRow = m_rows[m_sliceBaseRow[sliceId]]; > + ScopedLock self(curRow.lock); > + curRow.active = true; > + } > > if (m_param->bEnableWavefront) > { > @@ -1909,6 +1913,7 @@ void FrameEncoder::processRowEncoder(int intRow, > ThreadLocalData& tld) > } > } > > + ScopedLock self(curRow.lock); > curRow.busy = false; > > // CHECK_ME: Does it always FALSE condition? > -- > 2.25.1 > > From caaf4d9d4ee86cf5479b47cbc0e94dca3fe626a1 Mon Sep 17 00:00:00 2001 > From: Rob Arrow <rob.ar...@v-nova.com> > Date: Thu, 16 Dec 2021 13:33:58 +0000 > Subject: [PATCH 2/2] Prevent VBV underflow deadlock > > Previously, an issue could occur within `frameencoder.cpp` when a VBV > underflow was detected that requires the encode to be restarted in one > thread (thread A). If the last row of the slice has been marked enqueued > and completed all of the CUs in another thread (thread B) before thread A > sets m_vbvResetTriggerRow and m_bAllRowsStop, it is possible for a deadlock > to occur within thread A this loop in `frameencoder.cpp`: > > ``` > while (stopRow.active) > { > if (dequeueRow(m_row_to_idx[r] * 2)) > stopRow.active = false; > else > { > /* we must release the row lock to allow the thread to exit */ > stopRow.lock.release(); > GIVE_UP_TIME(); > stopRow.lock.acquire(); > } > } > ``` > > In this case, stopRow.active was set by the row above in thread B. The > only way for thread A to exit the while loop is for `stopRow.active` to be > cleared elsewhere or for `dequeueRow()` to return true. `dequeueRow()` is > unable to return true as thead B has already removed the row from the queue > when it was processed by `WaveFront::findJob()`. > > There was previously no code path available for `stopRow.active` to be > cleared in another thread to prevent this deadlock occurring. > > This commit clears the active variable when the row is completed at the > end of `FrameEncoder::processRowEncoder()` in thread B. > --- > source/encoder/frameencoder.cpp | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/source/encoder/frameencoder.cpp > b/source/encoder/frameencoder.cpp > index e0c3c55b8..3ea281da1 100644 > --- a/source/encoder/frameencoder.cpp > +++ b/source/encoder/frameencoder.cpp > @@ -1915,6 +1915,7 @@ void FrameEncoder::processRowEncoder(int intRow, > ThreadLocalData& tld) > > ScopedLock self(curRow.lock); > curRow.busy = false; > + curRow.active = false; > > // CHECK_ME: Does it always FALSE condition? > if (ATOMIC_INC(&m_completionCount) == 2 * (int)m_numRows) > -- > 2.25.1 > > _______________________________________________ > x265-devel mailing list > x265-devel@videolan.org > https://mailman.videolan.org/listinfo/x265-devel >
thumbnail_bfa8ec31-1546-45d1-990c-0de39ea526e7.jfif
Description: Binary data
_______________________________________________ x265-devel mailing list x265-devel@videolan.org https://mailman.videolan.org/listinfo/x265-devel