Hi:
There is some news about this question. The new code is below; I
changed from __sync_fetch_and_add to pthread_mutex_lock/pthread_mutex_unlock:
pthread_mutex_lock(&g_mutex);
int curr = m_nCurrent;
m_nCurrent += m_nStep;
pthread_mutex_unlock(&g_mutex);
With this change, none of the test cases run too long or fail under valgrind
any more. However, pthread_mutex_lock is not as efficient as
__sync_fetch_and_add, so the mutex is only a temporary measure for testing.
I suspect the problem is related to the scheduling module of valgrind.
Why did the runs take so long?
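
For comparison, here is a minimal sketch of the two variants side by side. It is not the real class: `RangeDispenser` and its member names are hypothetical stand-ins for CDynamicScheduling, and it assumes a C++11 compiler, using std::mutex in place of pthread_mutex_t and std::atomic<int>::fetch_add in place of __sync_fetch_and_add. Clamping the end of the last chunk to m_nEnd + 1 is also an assumption; the quoted GetProcLoop does not do that.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-in for the scheduler discussed in this thread.
class RangeDispenser {
public:
    RangeDispenser(int begin, int end, int step)
        : m_current(begin), m_atomicCurrent(begin), m_end(end), m_step(step) {}

    // Mutex variant: read and advance the counter under a lock.
    bool NextLocked(int& begin, int& endPlusOne) {
        int curr;
        {
            std::lock_guard<std::mutex> guard(m_mutex);
            curr = m_current;
            m_current += m_step;
        }
        return Fill(curr, begin, endPlusOne);
    }

    // Atomic variant: fetch_add returns the value held before the addition.
    bool NextAtomic(int& begin, int& endPlusOne) {
        int curr = m_atomicCurrent.fetch_add(m_step);
        return Fill(curr, begin, endPlusOne);
    }

private:
    // Turn a claimed start value into a half-open row range [begin, endPlusOne).
    bool Fill(int curr, int& begin, int& endPlusOne) const {
        if (curr > m_end) {
            return false;  // nothing left to hand out
        }
        begin = curr;
        const int limit = m_end + 1;
        // Assumption: clamp the last chunk so it never runs past m_end.
        endPlusOne = (curr + m_step < limit) ? (curr + m_step) : limit;
        return true;
    }

    std::mutex m_mutex;
    int m_current;                     // used only by NextLocked
    std::atomic<int> m_atomicCurrent;  // used only by NextAtomic
    int m_end;
    int m_step;
};
```

Both variants hand out the same sequence of start values; only the synchronization differs. Either way, a larger step means fewer trips through the shared counter for the same number of rows.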
BR
Owen
-----Original Message-----
From: John Reiser [mailto:[email protected]]
Sent: January 26, 2018 12:44
To: [email protected]
Subject: Re: [Valgrind-users] Re: Re: Re: [Help] Valgrind sometime run the program
very slowly sometimes , it last at least one hour. can you show me why or some
way to analyze it?
On 01/25/2018 15:37 UTC, Wuweijia wrote:
> Function1:
> bool CDynamicScheduling::GetProcLoop(
> int& nBegin,
> int& nEndPlusOne)
> {
> int curr = __sync_fetch_and_add(&m_nCurrent, m_nStep);
How large is 'm_nStep'? [Are you sure?] The overhead expense of switching
threads in valgrind would be reduced by making m_nStep as large as possible.
It looks like the code in Function2 would produce the same values regardless.
> if (curr > m_nEnd)
> {
> return false;
> }
>
> nBegin = curr;
> int limit = m_nEnd + 1;
Local variable 'limit' is unused. By itself this is unimportant, but it might
be a clue to something that is not shown here.
> nEndPlusOne = curr + m_nStep;
> return true;
> }
>
>
> Function2:
> ....
> int beginY, endY;
> while (pDS->GetProcLoop(beginY, endY)){
> for (y = beginY; y < endY; y++){
> for(x = 0; x < dstWDiv2-7; x+=8){
> vtmp0 = vld2q_u16(&pSrc[(y<<1)*srcStride+(x<<1)]);
> vtmp1 = vld2q_u16(&pSrc[((y<<1)+1)*srcStride+(x<<1)]);
I hope the actual source contains a comment such as:
Compute pDst[] as the rounded average of non-overlapping 2x2 blocks of
pixels in pSrc[].
> vst1q_u16(&pDst[y*dstStride+x], (vtmp0.val[0] + vtmp0.val[1] +
> vtmp1.val[0] + vtmp1.val[1] + vdupq_n_u16(2)) >> vdupq_n_u16(2));
> }
> for(; x < dstWDiv2; x++){
> pDst[y*dstStride+x] = (pSrc[(y<<1)*srcStride+(x<<1)] +
> pSrc[(y<<1)*srcStride+(x<<1)+1] + pSrc[((y<<1)+1)*srcStride+(x<<1)] +
> pSrc[((y<<1)+1)*srcStride+((x<<1)+1)] + 2) >> 2;
> }
> }
> }
>
> return;
> }
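
As an illustration of the comment suggested above (the rounded average of non-overlapping 2x2 blocks), here is a minimal scalar sketch of the same computation without NEON intrinsics. The function name `Downsample2x2` and its parameter list are hypothetical and not taken from the original source; strides are assumed to be counted in elements, not bytes. Note that it sums in 32 bits, whereas arithmetic on uint16x8_t vectors wraps modulo 2^16, so it should match the vector path only when the four pixels plus 2 fit in 16 bits.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical scalar reference: dst[y][x] is the rounded average of the
// 2x2 block of src whose top-left corner is at (2*y, 2*x).
void Downsample2x2(const std::uint16_t* src, std::size_t srcStride,
                   std::uint16_t* dst, std::size_t dstStride,
                   int dstWidth, int beginY, int endY) {
    for (int y = beginY; y < endY; ++y) {
        // Two consecutive source rows feed one destination row.
        const std::uint16_t* row0 = src + static_cast<std::size_t>(2 * y) * srcStride;
        const std::uint16_t* row1 = row0 + srcStride;
        std::uint16_t* out = dst + static_cast<std::size_t>(y) * dstStride;
        for (int x = 0; x < dstWidth; ++x) {
            // Sum in 32 bits so the intermediate value cannot overflow;
            // adding 2 before the shift rounds to nearest.
            const std::uint32_t sum = std::uint32_t(row0[2 * x]) + row0[2 * x + 1] +
                                      row1[2 * x] + row1[2 * x + 1] + 2u;
            out[x] = static_cast<std::uint16_t>(sum >> 2);
        }
    }
}
```

A reference like this can be run on the same rows that GetProcLoop hands out, which makes it easier to check the vectorized loop independently of the threading.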
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most engaging tech
sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Valgrind-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/valgrind-users