[FFmpeg-devel] [PATCH] mips/float_dsp: fix vector_fmul_window_mips on mips64
Commit dfa920807494 (mips/float_dsp: fix a bug in vector_fmul_window_mips)
fixed vector_fmul_window_mips by unrolling the loop only 4 times, but also
removed the outer C loop and replaced it with assembly branches and pointer
arithmetic.

When submitting my 64-bit porting patch I missed this new assembly which also
needed porting.

This patch fixes a bus error in the fate-float-dsp test when run on 64-bit mips.

Signed-off-by: James Cowgill james...@cowgill.org.uk
Cc: Nedeljko Babic nedeljko.ba...@imgtec.com
---
 libavutil/mips/float_dsp_mips.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/libavutil/mips/float_dsp_mips.c b/libavutil/mips/float_dsp_mips.c
index a455687..b3a812c 100644
--- a/libavutil/mips/float_dsp_mips.c
+++ b/libavutil/mips/float_dsp_mips.c
@@ -188,10 +188,10 @@ static void vector_fmul_window_mips(float *dst, const float *src0,
 lwc1 %[wj3], -12(%[win_j])\n\t
 lwc1 %[s0], 8(%[src0_i])\n\t
 lwc1 %[s01],12(%[src0_i]) \n\t
-addiu %[src1_j],-16 \n\t
-addiu %[win_i], 16 \n\t
-addiu %[win_j], -16 \n\t
-addiu %[src0_i], 16 \n\t
+PTR_ADDIU %[src1_j],-16\n\t
+PTR_ADDIU %[win_i],16 \n\t
+PTR_ADDIU %[win_j],-16 \n\t
+PTR_ADDIU %[src0_i],16 \n\t
 swc1 %[temp], 0(%[dst_i]) \n\t /* dst[i] = s0*wj - s1*wi; */
 swc1 %[temp1], 0(%[dst_j]) \n\t /* dst[j] = s0*wi + s1*wj; */
 swc1 %[temp2], 4(%[dst_i]) \n\t /* dst[i+1] = s01*wj1 - s11*wi1; */
@@ -208,8 +208,8 @@ static void vector_fmul_window_mips(float *dst, const float *src0,
 swc1 %[temp1], -8(%[dst_j]) \n\t /* dst[j-2] = s0*wi2 + s1*wj2; */
 swc1 %[temp2], 12(%[dst_i])\n\t /* dst[i+2] = s01*wj3 - s11*wi3; */
 swc1 %[temp3], -12(%[dst_j])\n\t /* dst[j-3] = s01*wi3 + s11*wj3; */
-addiu %[dst_i], 16 \n\t
-addiu %[dst_j], -16 \n\t
+PTR_ADDIU %[dst_i],16 \n\t
+PTR_ADDIU %[dst_j],-16 \n\t
 bne %[win_i], %[lp_end], 1b\n\t
 : [temp]=f(temp), [temp1]=f(temp1), [temp2]=f(temp2), [temp3]=f(temp3),
 [src0_i]+r(src0_i), [win_i]+r(win_i),
--
2.1.4

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
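
The patch above swaps `addiu` for `PTR_ADDIU` wherever a pointer register is stepped by 16 bytes. The following is a minimal sketch of why that matters, not code from FFmpeg. `model_addiu`, `model_daddiu`, `SKETCH_PTR_ADDIU` and the sample addresses are hypothetical names and values chosen for illustration. The model treats `addiu` as a 32-bit add whose result is sign-extended to 64 bits; on real MIPS64 hardware the result of `addiu` on a register that does not hold a sign-extended 32-bit value is architecturally unpredictable, so this is a simplification.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model (an assumption, see the text above) of `addiu` acting on a
 * 64-bit register: the sum is formed in 32 bits, then sign-extended. */
static uint64_t model_addiu(uint64_t reg, int16_t imm)
{
    uint32_t low = (uint32_t)reg + (uint32_t)(int32_t)imm;
    uint64_t wide = low;
    if (low & 0x80000000u)
        wide |= 0xFFFFFFFF00000000ull;   /* manual sign extension */
    return wide;
}

/* Model of `daddiu`: a full 64-bit add of the sign-extended immediate. */
static uint64_t model_daddiu(uint64_t reg, int16_t imm)
{
    return reg + (uint64_t)(int64_t)imm;
}

/* Hypothetical stand-in for a PTR_ADDIU-style macro: pick the mnemonic that
 * matches the pointer width. The real macro may be keyed on the ABI instead. */
#if UINTPTR_MAX > 0xFFFFFFFFu
# define SKETCH_PTR_ADDIU "daddiu "
#else
# define SKETCH_PTR_ADDIU "addiu "
#endif

static const char *sketch_ptr_addiu(void)
{
    return SKETCH_PTR_ADDIU;
}

/* Step two made-up addresses by 16 bytes, as the unrolled loop does. */
static void check_pointer_step(void)
{
    const uint64_t low_addr  = 0x0000000010001000ull; /* fits in 31 bits */
    const uint64_t high_addr = 0x000000FFF0001000ull; /* needs 64 bits   */

    /* Below 2 GiB both instructions agree. */
    assert(model_addiu(low_addr, 16) == model_daddiu(low_addr, 16));

    /* Above 4 GiB the 32-bit add discards the upper half of the pointer. */
    assert(model_daddiu(high_addr, 16) == high_addr + 16);
    assert(model_addiu(high_addr, 16) != high_addr + 16);
}
```

Under this model, any pointer whose upper 32 bits carry information is corrupted by the 32-bit add, and the next load or store through it goes to the wrong address. That is consistent with the bus error described in the patch, though the sketch does not reproduce it.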
Re: [FFmpeg-devel] [PATCH] mips/float_dsp: fix vector_fmul_window_mips on mips64
LGTM

Thanks,
- Nedeljko

From: ffmpeg-devel-boun...@ffmpeg.org [ffmpeg-devel-boun...@ffmpeg.org] on behalf of James Cowgill [james...@cowgill.org.uk]
Sent: 18 March 2015 14:02
To: ffmpeg-devel@ffmpeg.org
Cc: Nedeljko Babic; James Cowgill
Subject: [FFmpeg-devel] [PATCH] mips/float_dsp: fix vector_fmul_window_mips on mips64
Re: [FFmpeg-devel] [PATCH] mips/float_dsp: fix vector_fmul_window_mips on mips64
On Wed, Mar 18, 2015 at 02:50:29PM +, Nedeljko Babic wrote:
> LGTM

applied

thanks

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Dictatorship naturally arises out of democracy, and the most aggravated
form of tyranny and slavery out of the most extreme liberty. -- Plato