> -----Original Message-----
> From: ffmpeg-devel <ffmpeg-devel-boun...@ffmpeg.org> On Behalf Of chen
> Sent: Tuesday, December 3, 2019 4:59 PM
> To: FFmpeg development discussions and patches <ffmpeg-de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 3/3] avfilter/vf_convolution: add X86
> SIMD for filter_column()
>
> comments inline in code
>
>
> At 2019-12-03 15:52:07, xuju...@sjtu.edu.cn wrote:
> >From: Xu Jun <xuju...@sjtu.edu.cn>
[...]
> >+
> >+    cvtdq2ps m4, m4
> >+    mulps    m4, m0     ; sum *= rdiv
> >+    addps    m4, m1     ; sum += bias
>
> >+    addps    m4, m5     ; sum += 0.5
> I don't know how about precision mismatch if we pre-compute (bias+0.5)

I think it is hard to prove that pre-computing (bias + 0.5) is safe.
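To make the precision concern concrete, here is a minimal C sketch comparing the two orders of evaluation. The function names are hypothetical, the inputs are deliberately contrived, and it assumes IEEE 754 single precision with the default round-to-nearest-even mode; it is not taken from the patch.

```c
#include <stdint.h>

/* Order used in the quoted asm: ((sum * rdiv) + bias) + 0.5,
 * then truncation toward zero (as cvttps2dq does). */
static int round_separate(int sum, float rdiv, float bias)
{
    /* volatile forces each intermediate to be rounded to float */
    volatile float v = (float)sum * rdiv;
    v = v + bias;
    v = v + 0.5f;
    return (int)v;
}

/* Suggested shortcut: fold bias and 0.5 into one constant first. */
static int round_precomputed(int sum, float rdiv, float bias)
{
    volatile float bias_half = bias + 0.5f; /* may already lose bits here */
    volatile float v = (float)sum * rdiv;
    v = v + bias_half;
    return (int)v;
}
```

With sum = 1, rdiv = 0.5f - 0x1p-24f and bias = 0x1p-25f, round_separate() gives 1 while round_precomputed() gives 0, because bias + 0.5 rounds back to exactly 0.5. Such inputs are unlikely in practice, but they show that the two orders are not bit-identical in general.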
>
>
> >+    cvttps2dq m4, m4
> >+    packssdw  m4, m4
> >+    packuswb  m4, m4
> >+    movss     [dstq + dst_offq], m4
> >+    add c_offq, mmsize/4
> >+    add dst_offq, mmsize/4
> >+
> >+    add off16q, mmsize/4
> >+    cmp off16q, widthq
> >+    jl .loop16
> >+
> >+    add widthq, rq
> >+    cmp off16q, widthq
> >+    jge .paraend
> >+
>
> >+  .loopr:
> no idea about this loop, if we can read beyond, we can reuse above SIMD
> code

Reusing the SIMD code above may write to memory that does not belong to this
slice thread. IMO, the code that handles the remainder columns is still necessary.

Ruiling
>
>
> >+    xor sumd, sumd
> >+    xor iq, iq
> >+  .loopr_i:
> >+    mov ciq, [ptrq + iq * gprsize]
> >+    movzx rd, byte [ciq + c_offq]
> >+    imul rd, [matrixq + 4*iq]
> >+    add sumd, rd
> >+
> >+    add iq, 1
> >+    cmp iq, radq
> >+    jl .loopr_i
> >+
> >+    pxor      m4, m4
> >+    cvtsi2ss  m4, sumd
> >+    mulss     m4, m0     ; sum *= rdiv
> >+    addss     m4, m1     ; sum += bias
> >+    addss     m4, m5     ; sum += 0.5
> >+    cvttps2dq m4, m4
> >+    packssdw  m4, m4
> >+    packuswb  m4, m4
> >+    movd      sumd, m4
> >+    mov       [dstq + dst_offq], sumb
> >+    add c_offq, 1
> >+    add dst_offq, 1
> >+    add off16q, 1
> >+    cmp off16q, widthq
> >+    jl .loopr
> >+
> >+  .paraend:
> >+    sub c_offq, widthq
> >+    sub dst_offq, widthq
> >+    add c_offq, strideq
> >+    add dst_offq, dstrideq
> >+
> >+    sub heightq, 1
> >+    cmp heightq, 0
> >+    jg .loopy
> >+
> >+.end:
> >+    RET
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel@ffmpeg.org
> https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
>
> To unsubscribe, visit link above, or email
> ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
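
On the remainder-column point above: a rough C sketch of the same loop split, where full 4-pixel groups go through the wide path and the leftover columns are written one byte at a time, so nothing past `width` is touched. LANES, scale_pixel() and filter_span() are hypothetical stand-ins for illustration, not the code in the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 4  /* pixels per wide iteration, mirroring mmsize/4 with SSE */

/* Hypothetical per-pixel kernel standing in for the convolution sum. */
static uint8_t scale_pixel(uint8_t in, float rdiv, float bias)
{
    float v = (float)in * rdiv + bias + 0.5f;
    int   i = (int)v;            /* truncate, like cvttps2dq */
    if (i < 0)   i = 0;          /* saturate, like packssdw + packuswb */
    if (i > 255) i = 255;
    return (uint8_t)i;
}

/* Writes exactly `width` bytes to dst and never beyond. */
static void filter_span(uint8_t *dst, const uint8_t *src, size_t width,
                        float rdiv, float bias)
{
    size_t x = 0;
    size_t vec_end = width - width % LANES;  /* last full group boundary */

    /* Wide path: each outer iteration models one 4-byte SIMD store. */
    for (; x < vec_end; x += LANES)
        for (size_t k = 0; k < LANES; k++)
            dst[x + k] = scale_pixel(src[x + k], rdiv, bias);

    /* Remainder columns: scalar, one byte per store. */
    for (; x < width; x++)
        dst[x] = scale_pixel(src[x], rdiv, bias);
}
```

If the wide path were also used for the tail, its 4-byte store could overwrite up to three bytes that another slice thread is responsible for, which is the case the scalar loop avoids.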