[FFmpeg-devel] [PATCH] mips/float_dsp: fix vector_fmul_window_mips on mips64

2015-03-18 Thread James Cowgill
Commit dfa920807494 (mips/float_dsp: fix a bug in vector_fmul_window_mips)
fixed vector_fmul_window_mips by unrolling the loop only 4 times, but also
removed the outer C loop and replaced it with assembly branches and pointer
arithmetic. When submitting my 64-bit porting patch I missed this new
assembly which also needed porting.

This patch fixes a bus error in the fate-float-dsp test when run on 64-bit
mips.

Signed-off-by: James Cowgill james...@cowgill.org.uk
Cc: Nedeljko Babic nedeljko.ba...@imgtec.com
---
 libavutil/mips/float_dsp_mips.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/libavutil/mips/float_dsp_mips.c b/libavutil/mips/float_dsp_mips.c
index a455687..b3a812c 100644
--- a/libavutil/mips/float_dsp_mips.c
+++ b/libavutil/mips/float_dsp_mips.c
@@ -188,10 +188,10 @@ static void vector_fmul_window_mips(float *dst, const float *src0,
 lwc1 %[wj3],   -12(%[win_j])\n\t
 lwc1 %[s0], 8(%[src0_i])\n\t
 lwc1 %[s01],12(%[src0_i])   \n\t
-addiu   %[src1_j],-16  \n\t
-addiu   %[win_i],  16  \n\t
-addiu   %[win_j], -16  \n\t
-addiu   %[src0_i], 16  \n\t
+PTR_ADDIU %[src1_j],-16\n\t
+PTR_ADDIU %[win_i],16  \n\t
+PTR_ADDIU %[win_j],-16 \n\t
+PTR_ADDIU %[src0_i],16 \n\t
 swc1 %[temp],   0(%[dst_i]) \n\t /* dst[i] = s0*wj - s1*wi; */
 swc1 %[temp1],  0(%[dst_j]) \n\t /* dst[j] = s0*wi + s1*wj; */
 swc1 %[temp2],  4(%[dst_i]) \n\t /* dst[i+1] = s01*wj1 - s11*wi1; */
@@ -208,8 +208,8 @@ static void vector_fmul_window_mips(float *dst, const float *src0,
 swc1 %[temp1], -8(%[dst_j]) \n\t /* dst[j-2] = s0*wi2 + s1*wj2; */
 swc1 %[temp2],  12(%[dst_i])\n\t /* dst[i+2] = s01*wj3 - s11*wi3; */
 swc1 %[temp3], -12(%[dst_j])\n\t /* dst[j-3] = s01*wi3 + s11*wj3; */
-addiu   %[dst_i],  16  \n\t
-addiu   %[dst_j], -16  \n\t
+PTR_ADDIU %[dst_i],16  \n\t
+PTR_ADDIU %[dst_j],-16 \n\t
 bne %[win_i], %[lp_end], 1b\n\t
  : [temp]=f(temp), [temp1]=f(temp1), [temp2]=f(temp2),
    [temp3]=f(temp3), [src0_i]+r(src0_i), [win_i]+r(win_i),
-- 
2.1.4

___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
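
Why a plain addiu on a pointer can lead to a bus error on 64-bit mips may not be obvious from the patch alone, so here is a rough host-side model in C. It is only a sketch under assumptions: the function names, the MODEL_PTR_ADDIU macro and the example address are made up for illustration and are not taken from FFmpeg, and the model follows the commonly described behaviour in which addiu adds in 32 bits and sign-extends the result, while daddiu adds in the full 64 bits. A pointer-width macro in the spirit of PTR_ADDIU would presumably pick the second form on a 64-bit ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Model of `addiu` on a 64-bit register (assumed behaviour): the add is
 * done on the low 32 bits and the 32-bit result is sign-extended.
 * The uint32_t -> int32_t cast assumes the usual two's complement wrap. */
static uint64_t model_addiu(uint64_t reg, int16_t imm)
{
    uint32_t low = (uint32_t)reg + (uint32_t)(int32_t)imm;
    return (uint64_t)(int64_t)(int32_t)low;
}

/* Model of `daddiu`: a full 64-bit add of the sign-extended immediate. */
static uint64_t model_daddiu(uint64_t reg, int16_t imm)
{
    return reg + (uint64_t)(int64_t)imm;
}

/* Hypothetical pointer-width selection, in the spirit of PTR_ADDIU:
 * use the 64-bit add when pointers are wider than 32 bits. */
#if UINTPTR_MAX > 0xFFFFFFFFu
#define MODEL_PTR_ADDIU model_daddiu
#else
#define MODEL_PTR_ADDIU model_addiu
#endif

/* Advance a made-up 64-bit address by 16 bytes, as the unrolled loop does. */
static void demo(void)
{
    uint64_t ptr = 0x000000FFF0001000ULL; /* hypothetical address above 4 GiB */

    /* 64-bit add: the pointer simply moves forward by 16 bytes. */
    assert(model_daddiu(ptr, 16) == 0x000000FFF0001010ULL);

    /* 32-bit add: the upper half of the address is replaced by sign bits. */
    assert(model_addiu(ptr, 16) == 0xFFFFFFFFF0001010ULL);
}
```

Calling demo() from a small main() runs both checks. In this model the second result no longer points into the original buffer, which would be consistent with the bus error reported for the fate-float-dsp test.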


Re: [FFmpeg-devel] [PATCH] mips/float_dsp: fix vector_fmul_window_mips on mips64

2015-03-18 Thread Nedeljko Babic
LGTM

Thanks,
- Nedeljko

From: ffmpeg-devel-boun...@ffmpeg.org [ffmpeg-devel-boun...@ffmpeg.org] on behalf of James Cowgill [james...@cowgill.org.uk]
Sent: 18 March 2015 14:02
To: ffmpeg-devel@ffmpeg.org
Cc: Nedeljko Babic; James Cowgill
Subject: [FFmpeg-devel] [PATCH] mips/float_dsp: fix vector_fmul_window_mips on mips64



Re: [FFmpeg-devel] [PATCH] mips/float_dsp: fix vector_fmul_window_mips on mips64

2015-03-18 Thread Michael Niedermayer
On Wed, Mar 18, 2015 at 02:50:29PM +, Nedeljko Babic wrote:
> LGTM

applied

thanks

 

-- 
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Dictatorship naturally arises out of democracy, and the most aggravated
form of tyranny and slavery out of the most extreme liberty. -- Plato

