Re: [PATCH -mm crypto] AES: x86_64 asm implementation optimization

2008-05-06 Thread Huang, Ying
Hi,

On Sat, 2008-05-03 at 23:25 -0700, dean gaudet wrote:
 one of the more important details in evaluating these changes would be the 
 family/model/stepping of the processors being microbenchmarked... could 
 you folks include /proc/cpuinfo with the results?

The file attached is /proc/cpuinfo of my testing machine.

Best Regards,
Huang Ying

 also -- please drop the #define for R16 to %rsp ... it obfuscates more 
 than it helps anything.
 
 thanks
 -dean
 
 On Wed, 30 Apr 2008, Sebastian Siewior wrote:
 
  * Huang, Ying | 2008-04-25 11:11:17 [+0800]:
  
  Hi, Sebastian,
  Hi Huang,
  
  sorry for the delay.
  
  I changed the patches to group the read or write together instead of
  interleaving. Can you help me to test these new patches? The new patches
  are attached to the mail.
  The new results are attached.
  
  
  Best Regards,
  Huang Ying
  
  Sebastian
  
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU  6400  @ 2.13GHz
stepping        : 2
cpu MHz         : 2128.006
cache size      : 2048 KB
physical id     : 0
siblings        : 2
core id         : 0
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips        : 4259.15
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU  6400  @ 2.13GHz
stepping        : 2
cpu MHz         : 2128.006
cache size      : 2048 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr lahf_lm
bogomips        : 4256.08
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:
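
For anyone collecting results from several machines, the family/model/stepping triple requested in this thread can be pulled out of a cpuinfo dump like the one above with a few lines of Python. This is only a minimal sketch: the function names `parse_cpuinfo` and `cpu_signature` are made up for illustration, and it assumes the x86 field names shown above (`vendor_id`, `cpu family`, `model`, `stepping`).

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo-style text into one dict per logical processor."""
    cpus, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends one processor block.
            if current:
                cpus.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            # Lines without a colon (e.g. mail-wrapped flag lists) are skipped.
            current[key.strip()] = value.strip()
    if current:
        cpus.append(current)
    return cpus


def cpu_signature(cpu):
    """Return (vendor, family, model, stepping) for one parsed block.

    Assumes the x86 field names; other architectures use different keys.
    """
    return (
        cpu["vendor_id"],
        int(cpu["cpu family"]),
        int(cpu["model"]),
        int(cpu["stepping"]),
    )
```

Feeding the text of `/proc/cpuinfo` to `parse_cpuinfo()` and mapping `cpu_signature()` over the result gives one tuple per logical processor, which is compact enough to paste next to each benchmark table.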



Re: [PATCH -mm crypto] AES: x86_64 asm implementation optimization

2008-05-06 Thread Huang, Ying
Hi, Sebastian,

On Wed, 2008-04-30 at 00:12 +0200, Sebastian Siewior wrote:
 * Huang, Ying | 2008-04-25 11:11:17 [+0800]:
 
 Hi, Sebastian,
 Hi Huang,
 
 sorry for the delay.
 
 I changed the patches to group the read or write together instead of
 interleaving. Can you help me to test these new patches? The new patches
 are attached to the mail.
 The new results are attached.

It seems that the performance degradation from step4 to step5 has
decreased. But the overall performance degradation from step0 to
step7 is still about 5%.

I also tested the patches on Pentium 4 CPUs, and the performance decreased
there too. So I think this optimization is CPU micro-architecture dependent.

While the dependencies between instructions are reduced, more registers
(at most 3) are saved/restored before/after encryption/decryption. If
the CPU has no spare execution units for the newly independent instructions
but more registers are saved/restored, the performance will decrease.

Maybe we should select different implementations based on
micro-architecture.
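
To make the idea of a per-micro-architecture choice concrete, here is a rough sketch in Python of what such a selection rule could look like. It is not code from the patches. The variant names are hypothetical, the rule encodes only the two observations reported in this thread (slower on the Core 2 from the attached cpuinfo and on Pentium 4), it assumes Pentium 4 parts report CPU family 15, and defaulting every other CPU to the reordered code is an assumption that would need measurements.

```python
# Hypothetical variant names; neither is an identifier from the patches.
BASELINE = "baseline"    # code before the patch series (step0)
REORDERED = "reordered"  # code after the patch series (step7)


def prefers_baseline(vendor, family, model):
    """True for CPUs on which this thread reports the patches as slower."""
    if vendor != "GenuineIntel":
        return False
    if family == 15:
        # Assumption: Pentium 4 (NetBurst) parts report family 15.
        return True
    # The Core 2 6400 from the attached cpuinfo: family 6, model 15.
    return (family, model) == (6, 15)


def select_variant(vendor, family, model):
    """Pick a variant name from the CPU signature.

    Assumption: CPUs without a measurement default to the reordered code.
    """
    if prefers_baseline(vendor, family, model):
        return BASELINE
    return REORDERED
```

A table keyed on measured results would scale better than hard-coded conditions once more CPUs have been benchmarked; the point here is only that the decision needs vendor, family and model as inputs.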

Best Regards,
Huang Ying

--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH -mm crypto] AES: x86_64 asm implementation optimization

2008-05-04 Thread dean gaudet
one of the more important details in evaluating these changes would be the 
family/model/stepping of the processors being microbenchmarked... could 
you folks include /proc/cpuinfo with the results?

also -- please drop the #define for R16 to %rsp ... it obfuscates more 
than it helps anything.

thanks
-dean

On Wed, 30 Apr 2008, Sebastian Siewior wrote:

 * Huang, Ying | 2008-04-25 11:11:17 [+0800]:
 
 Hi, Sebastian,
 Hi Huang,
 
 sorry for the delay.
 
 I changed the patches to group the read or write together instead of
 interleaving. Can you help me to test these new patches? The new patches
 are attached to the mail.
 The new results are attached.
 
 
 Best Regards,
 Huang Ying
 
 Sebastian
 


Re: [PATCH -mm crypto] AES: x86_64 asm implementation optimization

2008-04-29 Thread Sebastian Siewior
* Huang, Ying | 2008-04-25 11:11:17 [+0800]:

 Hi, Sebastian,
Hi Huang,

sorry for the delay.

 I changed the patches to group the read or write together instead of
 interleaving. Can you help me to test these new patches? The new patches
 are attached to the mail.
The new results are attached.

 Best Regards,
 Huang Ying

Sebastian


steps-txt-v2.tbz2
Description: Binary data