[x265] [PATCH] asm: avx2 code for sad_x4[16xN] for 10 bpp

2015-05-19 Thread sumalatha
# HG changeset patch # User Sumalatha Polureddy # Date 1432018871 -19800 # Tue May 19 12:31:11 2015 +0530 # Node ID 7423bf9989d3def6f009a2dc813ac245d9789100 # Parent fd1f061f22290c209560abc5fd02d6401477861a asm: avx2 code for sad_x4[16xN] for 10 bpp sse2 sad_x4[ 16x4] 2.80x 976.64

[x265] [PATCH] asm: avx2 code for sad_x4[32xN] for 10 bpp

2015-05-19 Thread sumalatha
# HG changeset patch # User Sumalatha Polureddy # Date 1432020254 -19800 # Tue May 19 12:54:14 2015 +0530 # Node ID 179a50d8cc3efb9fef7b1d8f59b2d1d0f513e3ce # Parent 7423bf9989d3def6f009a2dc813ac245d9789100 asm: avx2 code for sad_x4[32xN] for 10 bpp sse2 sad_x4[ 32x8] 2.77x 3007.23

[x265] [PATCH] asm: avx2 10bit code for luma_hpp[8xN]

2015-05-19 Thread rajesh
# HG changeset patch # User Rajesh Paulraj # Date 1432018444 -19800 # Tue May 19 12:24:04 2015 +0530 # Node ID 712f3f1950098d1603a662944359978e19e39752 # Parent d7b100e51e828833eee006f1da93e499ac161d28 asm: avx2 10bit code for luma_hpp[8xN] avx2: luma_hpp[ 8x4] 7.30x 507.64

[x265] [PATCH] asm: avx2 10bit code for luma_hpp[16xN]

2015-05-19 Thread rajesh
# HG changeset patch # User Rajesh Paulraj # Date 1432019267 -19800 # Tue May 19 12:37:47 2015 +0530 # Node ID d0f54566d1f457f00fc071c47cbb04186e4da99e # Parent 712f3f1950098d1603a662944359978e19e39752 asm: avx2 10bit code for luma_hpp[16xN] avx2: luma_hpp[ 16x4] 7.81x 955.42

[x265] [PATCH] asm: avx2 10bit code for luma_hpp[32xN],[64xN]

2015-05-19 Thread rajesh
# HG changeset patch # User Rajesh Paulraj # Date 1432020349 -19800 # Tue May 19 12:55:49 2015 +0530 # Node ID 569f678f36d731690115b27ed244970f3bc822a8 # Parent d0f54566d1f457f00fc071c47cbb04186e4da99e asm: avx2 10bit code for luma_hpp[32xN],[64xN] avx2: luma_hpp[ 32x8] 8.32x 3627

[x265] [PATCH] asm: avx2 10bit code for luma_hpp[12x16] (5154.47 -> 3632.88)

2015-05-19 Thread rajesh
# HG changeset patch # User Rajesh Paulraj # Date 1432021220 -19800 # Tue May 19 13:10:20 2015 +0530 # Node ID b7f9e65a33ade32c1f14b04d69cce50cecde8ab5 # Parent 569f678f36d731690115b27ed244970f3bc822a8 asm: avx2 10bit code for luma_hpp[12x16] (5154.47 -> 3632.88) diff -r 569f678f36d7 -r b7f9

[x265] [PATCH] asm: avx2 10bit code for luma_hpp[24x32] (18855.08 -> 10742.66)

2015-05-19 Thread rajesh
# HG changeset patch # User Rajesh Paulraj # Date 1432024895 -19800 # Tue May 19 14:11:35 2015 +0530 # Node ID 9d394ee847ae33abb2a3ae06bf934eb5ebac3d03 # Parent b7f9e65a33ade32c1f14b04d69cce50cecde8ab5 asm: avx2 10bit code for luma_hpp[24x32] (18855.08 -> 10742.66) diff -r b7f9e65a33ad -r 9d

[x265] [PATCH] asm: avx2 10bit code for luma_hpp[48x64] (82440.47 -> 44731.61)

2015-05-19 Thread rajesh
# HG changeset patch # User Rajesh Paulraj # Date 1432024172 -19800 # Tue May 19 13:59:32 2015 +0530 # Node ID 6fad8107d1a6bebf92d7b38e57528b3cedf5cbd6 # Parent 9d394ee847ae33abb2a3ae06bf934eb5ebac3d03 asm: avx2 10bit code for luma_hpp[48x64] (82440.47 -> 44731.61) diff -r 9d394ee847ae -r 6f

[x265] [PATCH] asm: avx2 code for sad_x4[64xN] for 10 bpp

2015-05-19 Thread sumalatha
# HG changeset patch # User Sumalatha Polureddy # Date 1432026003 -19800 # Tue May 19 14:30:03 2015 +0530 # Node ID b3991a40f6a92a8fcfe7b24dcb02eeea7444178a # Parent 179a50d8cc3efb9fef7b1d8f59b2d1d0f513e3ce asm: avx2 code for sad_x4[64xN] for 10 bpp sse2 sad_x4[64x16] 2.65x 11016.03

[x265] [PATCH] asm: removed some duplicate constants and moved others into const-a.asm

2015-05-19 Thread dnyaneshwar
# HG changeset patch # User Dnyaneshwar G # Date 143202 -19800 # Tue May 19 15:18:08 2015 +0530 # Node ID b44cdf8dc08c77e84b8707992cd0006bbf23d864 # Parent ac32faec79be9c6a60d267086b4563bd884537c0 asm: removed some duplicate constants and moved others into const-a.asm diff -r ac32faec79

[x265] [PATCH] search: add lowres MV into search MV candidate list for search ME

2015-05-19 Thread gopu
# HG changeset patch # User Gopu Govindaswamy # Date 1432035244 -19800 # Tue May 19 17:04:04 2015 +0530 # Node ID 9cab90d3b70085081f142a792a70cfde622ef720 # Parent d7b100e51e828833eee006f1da93e499ac161d28 search: add lowres MV into search MV candidate list for search ME diff -r d7b100e51e82

[x265] [PATCH] analysis: add an additional round of sub-pel refinement for inter 2Nx2N in rd 5 and 6

2015-05-19 Thread santhoshini
# HG changeset patch # User Santhoshini Sekar # Date 1432028003 -19800 # Tue May 19 15:03:23 2015 +0530 # Node ID 904ac8808858baaeaaa333b5a105af50c1107db0 # Parent d7b100e51e828833eee006f1da93e499ac161d28 analysis: add an additional round of sub-pel refinement for inter 2Nx2N in rd 5 and 6

[x265] [PATCH] asm: removed duplicate and redundant constants

2015-05-19 Thread dnyaneshwar
# HG changeset patch # User Dnyaneshwar G # Date 1432037856 -19800 # Tue May 19 17:47:36 2015 +0530 # Node ID e6fc4b6f16b32debf4a252b47ad6fc9c82364188 # Parent b44cdf8dc08c77e84b8707992cd0006bbf23d864 asm: removed duplicate and redundant constants diff -r b44cdf8dc08c -r e6fc4b6f16b3 source

[x265] [PATCH] stats: profile effectiveness of reference limit masks

2015-05-19 Thread ashok
# HG changeset patch # User Ashok Kumar Mishra # Date 1431935454 -19800 # Mon May 18 13:20:54 2015 +0530 # Node ID 778e738401f622f0d59f345bcf817bc992f595fd # Parent e1058f78e11de9844c3b6a4ce4debc2fe7210d9a stats: profile effectiveness of reference limit masks diff -r e1058f78e11d -r 778e7384

[x265] [PATCH] analysis: at RD 0/4 avoid motion references if not used by split blocks

2015-05-19 Thread ashok
# HG changeset patch # User Ashok Kumar Mishra # Date 1431935334 -19800 # Mon May 18 13:18:54 2015 +0530 # Node ID e1058f78e11de9844c3b6a4ce4debc2fe7210d9a # Parent 1e2e70f90e4484b32217c7579bca98180929cf72 analysis: at RD 0/4 avoid motion references if not used by split blocks diff -r 1e2e70

[x265] [PATCH] analysis: re-order RD 0/4 analysis to do splits before ME or intra

2015-05-19 Thread ashok
# HG changeset patch # User Ashok Kumar Mishra # Date 1431933378 -19800 # Mon May 18 12:46:18 2015 +0530 # Node ID 1e2e70f90e4484b32217c7579bca98180929cf72 # Parent d7b100e51e828833eee006f1da93e499ac161d28 analysis: re-order RD 0/4 analysis to do splits before ME or intra diff -r d7b100e51e8

[x265] [PATCH] analysis: skip intra in RD 0/4 if split was analyzed and no split CUs used intra

2015-05-19 Thread ashok
# HG changeset patch # User Ashok Kumar Mishra # Date 1431935539 -19800 # Mon May 18 13:22:19 2015 +0530 # Node ID ade6bee4010cbdb9434c60a9c0d9c4df660952c4 # Parent 778e738401f622f0d59f345bcf817bc992f595fd analysis: skip intra in RD 0/4 if split was analyzed and no split CUs used intra diff

[x265] [PATCH] analysis: model the effectiveness of --limit-ref with RD 0/4

2015-05-19 Thread ashok
# HG changeset patch # User Ashok Kumar Mishra # Date 1431935793 -19800 # Mon May 18 13:26:33 2015 +0530 # Node ID f6c9a1e184fe7a8c17744117120828458be43106 # Parent 5bc61c2bc0cec50dc33eda9638f215de21fb4bcf analysis: model the effectiveness of --limit-ref with RD 0/4 diff -r 5bc61c2bc0ce -r f

[x265] [PATCH] stats: RD 0/4 profile effectiveness of avoiding intra if split CUs did not select it

2015-05-19 Thread ashok
# HG changeset patch # User Ashok Kumar Mishra # Date 1431935682 -19800 # Mon May 18 13:24:42 2015 +0530 # Node ID 5bc61c2bc0cec50dc33eda9638f215de21fb4bcf # Parent ade6bee4010cbdb9434c60a9c0d9c4df660952c4 stats: RD 0/4 profile effectiveness of avoiding intra if split CUs did not select it

[x265] [PATCH] analysis: respect X265_REF_LIMIT_DEPTH with RD 0/4

2015-05-19 Thread ashok
# HG changeset patch # User Ashok Kumar Mishra # Date 1431936119 -19800 # Mon May 18 13:31:59 2015 +0530 # Node ID 3a8c241580093e93b5e46a0a13f631c4420215cd # Parent f6c9a1e184fe7a8c17744117120828458be43106 analysis: respect X265_REF_LIMIT_DEPTH with RD 0/4 When this flag is not set, we do no

[x265] [PATCH] cli: connect --limit-refs to param.limitReferences

2015-05-19 Thread ashok
# HG changeset patch # User Ashok Kumar Mishra # Date 1431936273 -19800 # Mon May 18 13:34:33 2015 +0530 # Node ID 536be87c9bb64cf301a5dd276a9814e819c77110 # Parent 3a8c241580093e93b5e46a0a13f631c4420215cd cli: connect --limit-refs to param.limitReferences diff -r 3a8c24158009 -r 536be87c9bb

[x265] [PATCH] stats: with the CU reference limit, even 8x8 can have skipped motion searches

2015-05-19 Thread ashok
# HG changeset patch # User Ashok Kumar Mishra # Date 1431936386 -19800 # Mon May 18 13:36:26 2015 +0530 # Node ID ef34e2d5de9768c486e499140f8335f8067b5334 # Parent 536be87c9bb64cf301a5dd276a9814e819c77110 stats: with the CU reference limit, even 8x8 can have skipped motion searches diff -r

Re: [x265] [PATCH] asm: removed duplicate and redundant constants

2015-05-19 Thread Steve Borho
On 05/19, dnyanesh...@multicorewareinc.com wrote: > # HG changeset patch > # User Dnyaneshwar G > # Date 1432037856 -19800 > # Tue May 19 17:47:36 2015 +0530 > # Node ID e6fc4b6f16b32debf4a252b47ad6fc9c82364188 > # Parent b44cdf8dc08c77e84b8707992cd0006bbf23d864 > asm: removed duplicate and

Re: [x265] [PATCH] asm: removed some duplicate constants and moved others into const-a.asm

2015-05-19 Thread Steve Borho
On 05/19, dnyanesh...@multicorewareinc.com wrote: > # HG changeset patch > # User Dnyaneshwar G > # Date 143202 -19800 > # Tue May 19 15:18:08 2015 +0530 > # Node ID b44cdf8dc08c77e84b8707992cd0006bbf23d864 > # Parent ac32faec79be9c6a60d267086b4563bd884537c0 > asm: removed some duplicate

Re: [x265] [PATCH] search: add lowres MV into search MV candidate list for search ME

2015-05-19 Thread Steve Borho
On 05/19, g...@multicorewareinc.com wrote: > # HG changeset patch > # User Gopu Govindaswamy > # Date 1432035244 -19800 > # Tue May 19 17:04:04 2015 +0530 > # Node ID 9cab90d3b70085081f142a792a70cfde622ef720 > # Parent d7b100e51e828833eee006f1da93e499ac161d28 > search: add lowres MV into sea

Re: [x265] [PATCH] analysis: add an additional round of sub-pel refinement for inter 2Nx2N in rd 5 and 6

2015-05-19 Thread Steve Borho
On 05/19, santhosh...@multicorewareinc.com wrote: > # HG changeset patch > # User Santhoshini Sekar > # Date 1432028003 -19800 > # Tue May 19 15:03:23 2015 +0530 > # Node ID 904ac8808858baaeaaa333b5a105af50c1107db0 > # Parent d7b100e51e828833eee006f1da93e499ac161d28 > analysis: add an additio

Re: [x265] [PATCH] analysis: model the effectiveness of --limit-ref with RD 0/4

2015-05-19 Thread Steve Borho
On 05/19, as...@multicorewareinc.com wrote: > # HG changeset patch > # User Ashok Kumar Mishra > # Date 1431935793 -19800 > # Mon May 18 13:26:33 2015 +0530 > # Node ID f6c9a1e184fe7a8c17744117120828458be43106 > # Parent 5bc61c2bc0cec50dc33eda9638f215de21fb4bcf > analysis: model the effective

Re: [x265] [PATCH] analysis: re-order RD 0/4 analysis to do splits before ME or intra

2015-05-19 Thread Steve Borho
On 05/19, as...@multicorewareinc.com wrote: > # HG changeset patch > # User Ashok Kumar Mishra > # Date 1431933378 -19800 > # Mon May 18 12:46:18 2015 +0530 > # Node ID 1e2e70f90e4484b32217c7579bca98180929cf72 > # Parent d7b100e51e828833eee006f1da93e499ac161d28 > analysis: re-order RD 0/4 ana

Re: [x265] [PATCH] search: add lowres MV into search MV candidate list for search ME

2015-05-19 Thread Steve Borho
On 05/19, g...@multicorewareinc.com wrote: > # HG changeset patch > # User Gopu Govindaswamy > # Date 1432035244 -19800 > # Tue May 19 17:04:04 2015 +0530 > # Node ID 9cab90d3b70085081f142a792a70cfde622ef720 > # Parent d7b100e51e828833eee006f1da93e499ac161d28 > search: add lowres MV into sea

[x265] [PATCH 2 of 2] asm: interp_4tap_vert_pX_4xN sse2

2015-05-19 Thread dtyx265
# HG changeset patch # User David T Yuen # Date 1432078001 25200 # Node ID 509f7cbf8e09d6ddec4aa58040cfd206879d59e7 # Parent 3e07cba4b2034db2b819b2e11e98ee4b851d52b5 asm: interp_4tap_vert_pX_4xN sse2 Improved register usage for addressing of output. This improvement helps 64-bit .7% to 2.5% bu

[x265] [PATCH 1 of 2] asm: interp_4tap_vert_ps_4x2 sse2

2015-05-19 Thread dtyx265
# HG changeset patch # User David T Yuen # Date 1432070824 25200 # Node ID 3e07cba4b2034db2b819b2e11e98ee4b851d52b5 # Parent d7b100e51e828833eee006f1da93e499ac161d28 asm: interp_4tap_vert_ps_4x2 sse2 Removed unneeded add instruction. In theory this should provide a small performance improvement

[x265] [PATCH 0 of 2 ] asm: interp_4tap_vert_pX_4xN sse2

2015-05-19 Thread dtyx265
Small performance improvement in register addressing to reduce the number of lea instructions. I tried these types of tweaks on the other interp_4tap_vert_pX primitives only to find mixed results, and might submit more tweaks after more investigation.

Re: [x265] [PATCH 2 of 2] asm: interp_4tap_vert_pX_4xN sse2

2015-05-19 Thread chen
Why x64 only? r4 always free in both x86 and x64
At 2015-05-20 07:33:14,dtyx...@gmail.com wrote: ># HG changeset patch ># User David T Yuen ># Date 1432078001 25200 ># Node ID 509f7cbf8e09d6ddec4aa58040cfd206879d59e7 ># Parent 3e07cba4b2034db2b819b2e11e98ee4b851d52b5 >asm: interp_4tap_vert_pX_4x

Re: [x265] [PATCH 2 of 2] asm: interp_4tap_vert_pX_4xN sse2

2015-05-19 Thread dave
On 05/19/2015 04:40 PM, chen wrote:
> Why x64 only? r4 always free in both x86 and x64
It hurts performance in the benchtest for x32 but I can make it cover both if that is what you want.
At 2015-05-20 07:33:14,dtyx...@gmail.com wrote: ># HG changeset patch ># User Da

Re: [x265] [PATCH 2 of 2] asm: interp_4tap_vert_pX_4xN sse2

2015-05-19 Thread chen
At 2015-05-20 07:44:32,dave wrote:
> On 05/19/2015 04:40 PM, chen wrote:
>> Why x64 only? r4 always free in both x86 and x64
> It hurts performance in the benchtest for x32 but I can make it cover both if that is what you want.
yes, please modify it, thanks
At 2015-05-20 07:33:14,dtyx...@gma

Re: [x265] [PATCH] analysis: re-order RD 0/4 analysis to do splits before ME or intra

2015-05-19 Thread Steve Borho
On 05/19, Steve Borho wrote: > On 05/19, as...@multicorewareinc.com wrote: > > # HG changeset patch > > # User Ashok Kumar Mishra > > # Date 1431933378 -19800 > > # Mon May 18 12:46:18 2015 +0530 > > # Node ID 1e2e70f90e4484b32217c7579bca98180929cf72 > > # Parent d7b100e51e828833eee006f1da93e

[x265] [PATCH] asm: interp_4tap_vert_pX_4xN sse2

2015-05-19 Thread dtyx265
# HG changeset patch # User David T Yuen # Date 1432085346 25200 # Node ID e096c40ce8ff9c170bdb8caa094f53b30ebd7db7 # Parent 3e07cba4b2034db2b819b2e11e98ee4b851d52b5 asm: interp_4tap_vert_pX_4xN sse2 Improved register usage for addressing of output. This improvement helps 64-bit .7% to 2.5%. A

Re: [x265] [PATCH] asm: removed some duplicate constants and moved others into const-a.asm

2015-05-19 Thread Dnyaneshwar Gorade
Ok. I will resend this patch on latest tip.
On Tue, May 19, 2015 at 8:47 PM, Steve Borho wrote: > On 05/19, dnyanesh...@multicorewareinc.com wrote: > > # HG changeset patch > > # User Dnyaneshwar G > > # Date 143202 -19800 > > # Tue May 19 15:18:08 2015 +0530 > > # Node ID b44cdf8dc08c

[x265] [PATCH] asm: avx2 code for sad_x4[48x64] (33937 -> 15279) for 10 bpp

2015-05-19 Thread sumalatha
# HG changeset patch # User Sumalatha Polureddy # Date 1432100115 -19800 # Wed May 20 11:05:15 2015 +0530 # Node ID 395ebbcf7db4dace6e706444513e5f977537ed0c # Parent 9b31a8a7bd57efededcc3884eec09f649394 asm: avx2 code for sad_x4[48x64] (33937 -> 15279) for 10 bpp sse2 sad_x4[48x64] 2.55

[x265] [PATCH] asm: removed some duplicate constants and moved others into const-a.asm

2015-05-19 Thread dnyaneshwar
# HG changeset patch # User Dnyaneshwar G # Date 1432099930 -19800 # Wed May 20 11:02:10 2015 +0530 # Node ID cdf14fea15a846f2deca436a8e057711607f41bf # Parent 9b31a8a7bd57efededcc3884eec09f649394 asm: removed some duplicate constants and moved others into const-a.asm diff -r 9b31a8a7bd

[x265] [PATCH] asm: removed duplicate constants

2015-05-19 Thread dnyaneshwar
# HG changeset patch # User Dnyaneshwar G # Date 1432102991 -19800 # Wed May 20 11:53:11 2015 +0530 # Node ID 5244b9a0d9a20262c99801a42e346e0b3e07b315 # Parent cdf14fea15a846f2deca436a8e057711607f41bf asm: removed duplicate constants diff -r cdf14fea15a8 -r 5244b9a0d9a2 source/common/x86/in

[x265] [PATCH] asm: avx2 10bit code for luma_hpp[4xN]

2015-05-19 Thread rajesh
# HG changeset patch # User Rajesh Paulraj # Date 1432102514 -19800 # Wed May 20 11:45:14 2015 +0530 # Node ID 0fce0242d05d385afa69c003fbade0477fda43a2 # Parent 9b31a8a7bd57efededcc3884eec09f649394 asm: avx2 10bit code for luma_hpp[4xN] avx2: luma_hpp[ 4x4] 4.59x 423.90

Re: [x265] [PATCH] asm: removed duplicate constants

2015-05-19 Thread Deepthi Nandakumar
Thanks, there are already 2 patches with similar commit messages. Can you add more details to the commit message?
On Wed, May 20, 2015 at 12:04 PM, wrote: > # HG changeset patch > # User Dnyaneshwar G > # Date 1432102991 -19800 > # Wed May 20 11:53:11 2015 +0530 > # Node ID 5244b9a0d9a2026

[x265] [PATCH] cli: fix crash when pass the unrecognized options

2015-05-19 Thread gopu
# HG changeset patch # User Gopu Govindaswamy # Date 1432104767 -19800 # Wed May 20 12:22:47 2015 +0530 # Node ID 86c12f594b8964909f311a5a39f3941f73c94523 # Parent 9b31a8a7bd57efededcc3884eec09f649394 cli: fix crash when pass the unrecognized options diff -r 9b31a8a7bd57 -r 86c12f594b89