Re: [Qemu-devel] [v2 0/2] add avx2 instruction optimization

Li, Liang Z Thu, 12 Nov 2015 03:43:39 -0800

> >> >
> >> > I use your new code:
> >> > -------------------------------------------------
> >> >  unsigned long *p = ...
> >> >  if (p[0] || p[1] || p[2] || p[3]
> >> >      || memcmp(p+4, p, size - 4 * sizeof(unsigned long)) != 0)
> >> >          return BUFFER_NOT_ZERO;
> >> >  else
> >> >          return BUFFER_ZERO;
> >> > ---------------------------------------------------
> >> > and the result is almost the same.  I also tried checking 8 and 16
> >> > longs at the beginning, with the same result.
> >>
> >> Interesting...  Well, all I can say is that I applaud you for testing
> >> your hypothesis with the benchmark.
> >>
> >> Probably the setup cost of memcmp is too high, because the testing
> >> loop is already very optimized.
> >>
> >> Please submit the AVX2 version if it helps!
> 
> I read the emails in the wrong order.  Forget about my other email.
> 
> Sorry, Juan.
>


One thing I still can't understand: why does the unit test in the host
environment show that 'memcmp()' has better performance?
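
For reference, here is a minimal standalone sketch of the two checks being
compared. It assumes the buffer is aligned for unsigned long, that its size is
a multiple of sizeof(unsigned long), and that it holds at least four words. The
function names are hypothetical and are not the ones used in QEMU.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: plain word-by-word scan of the buffer. */
static bool is_zero_loop(const void *buf, size_t size)
{
    const unsigned long *p = buf;
    size_t n = size / sizeof(unsigned long);

    for (size_t i = 0; i < n; i++) {
        if (p[i]) {
            return false;
        }
    }
    return true;
}

/*
 * Hypothetical helper: test the first four words directly, then compare
 * the buffer against itself shifted by four words.  If the first four
 * words are zero and every word equals the word four positions earlier,
 * the whole buffer is zero.
 */
static bool is_zero_memcmp(const void *buf, size_t size)
{
    const unsigned long *p = buf;

    if (p[0] || p[1] || p[2] || p[3]) {
        return false;
    }
    return memcmp(p + 4, p, size - 4 * sizeof(unsigned long)) == 0;
}

/*
 * Hypothetical self-check: both helpers must agree on a zeroed 4 KiB page
 * and on the same page with one dirty byte.  Returns 0 on success.
 */
int compare_checks(void)
{
    size_t size = 4096;
    unsigned char *page = calloc(1, size);

    if (!page) {
        return -1;
    }
    bool ok = is_zero_loop(page, size) && is_zero_memcmp(page, size);
    page[size - 1] = 1;   /* dirty the last byte */
    ok = ok && !is_zero_loop(page, size) && !is_zero_memcmp(page, size);
    free(page);
    return ok ? 0 : 1;
}
```

Timing each helper over the same set of pages, once on the host and once in
the guest, would be one way to see whether the difference comes from the
environment rather than from the code itself.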

Liang
> 
> >
> > Yes, the AVX2 version really helps. I have already submitted it; could
> > you help review it?
> >
> > I am curious about the original intention behind adding the SSE2
> > intrinsics; was it for the same reason?
> >
> > I even suspect the VM may affect 'memcmp()' performance. Is that
> > possible?
> >
> > Liang
> >
> >> Paolo
