"Li, Liang Z" <liang.z...@intel.com> wrote:
>> Rather than trying to cater to multiple assembly instruction implementations
>> ourselves, have you tried taking the ideas in this earlier thread?
>> https://lists.gnu.org/archive/html/qemu-devel/2015-10/msg05298.html
>> Ideally, libc's memcmp() will already be using the most efficient assembly
>> instructions without us having to reproduce the work of picking the
>> instructions that work best.
> Eric, thanks for the information. I hadn't noticed that discussion before.
> I rewrote buffer_find_nonzero_offset() with the
> 'bool memeqzero4_paolo length', then wrote a test program to check a large
> number of zero pages, and used 'time' to record the time taken by each
> optimization. The test results are:
> -------------------------------------------
> Time (s)          |   test 1   |   test 2
> -------------------------------------------
> SSE2              |   13.696   |   13.533
> AVX2              |   10.583   |   10.306
> memeqzero4_paolo  |    9.718   |    9.817
> -------------------------------------------
> Paolo's implementation has the best performance. It seems that we can
> remove the SSE2-related intrinsics.
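
For context, the libc-based idea from the earlier thread can be sketched
roughly as below. This is only an illustration under assumptions: the name
buffer_is_zero_sketch() and the 16-byte head size are hypothetical, and this
is not the memeqzero4_paolo code that was benchmarked above. It checks a
short head by hand, then lets memcmp() compare the buffer against itself at
an offset, so libc picks the instructions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (name and head size are assumptions, not code
 * from the thread). Returns true if all 'length' bytes are zero.
 */
static bool buffer_is_zero_sketch(const void *data, size_t length)
{
    const unsigned char *p = data;
    size_t head = length < 16 ? length : 16;
    size_t i;

    /* Check the first few bytes one at a time. */
    for (i = 0; i < head; i++) {
        if (p[i]) {
            return false;
        }
    }
    if (length == head) {
        return true;
    }

    /*
     * The head is zero. If p[k] == p[k + head] for every remaining k,
     * then by induction every byte is zero. memcmp() only reads, so
     * the overlapping ranges are fine.
     */
    return memcmp(p, p + head, length - head) == 0;
}
```

Whether this beats hand-written SSE2/AVX2 code depends on the libc in use,
so it would need the same kind of timing as in the table above.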

How should I understand that comment? That you are about to send an
email to remove the SSE2 support and that I can forget about this patch?

Thanks, Juan.

> Liang
>> --
>> Eric Blake   eblake redhat com    +1-919-301-3266
>> Libvirt virtualization library http://libvirt.org
