01/10/2025 12:24, Shreesh Adiga:
> On Wed, Oct 1, 2025 at 1:25 PM Thomas Monjalon <[email protected]> wrote:
> 
> > 29/09/2025 18:28, Shreesh Adiga:
> > > On Wed, Sep 24, 2025 at 8:28 PM Thomas Monjalon <[email protected]> wrote:
> > >
> > > > Hello,
> > > >
> > > > 16/07/2025 12:34, Shreesh Adiga:
> > > > > Replace the clearing of lower 32 bits of XMM register with blend of
> > > > > zero register.
> > > > > Replace the clearing of upper 64 bits of XMM register with
> > > > > _mm_move_epi64.
> > > > > Clang is able to optimize away the AND + memory operand with the
> > > > > above sequence, however GCC is still emitting the code for AND with
> > > > > memory operands which is being explicitly eliminated here.
> > > > >
> > > > > Additionally replace the 48 byte crc_xmm_shift_tab with the contents
> > > > > of shf_table which is 32 bytes, achieving the same functionality.
> > > > >
> > > > > Signed-off-by: Shreesh Adiga <[email protected]>
> > > >
> > > > Sorry I'm not following.
> > > > Please could you start with defining the goal of this patch?
> > > > Is it a code simplification or a performance optimization?
> > >
> > > It is intended to be a minor performance optimization.
> >
> > Please could you give some performance numbers in the commit log?
> >
> I don't think that this change can be reliably measured. The changes only
> impact the last stage crc 64 to 32 fold and the last 16 bytes computation.
> The impact will only be a couple of clock cycles at best. Reducing the
> static array usage also I don't know if it can be reliably measured
> especially since it is not affecting the main loop.
> This patch can be ignored if minor incremental changes are not desirable.

Minor changes are desirable.
I'm just asking to understand the real impact of the change.
In general when doing an optimization we try to test it
and give some numbers.
Note we have a unit test for CRC in app/test/test_crc.c
but no performance test for it.
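
As a rough illustration of what such a performance test could measure,
here is a minimal, self-contained C sketch. It is not DPDK code:
crc32_bitwise() is a hypothetical stand-in (a plain bitwise CRC-32) for
the function under test, and bench_ns_per_call() is an assumed helper
name. Since the patch only touches the tail handling and the final fold,
small buffer lengths are probably the most interesting inputs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical stand-in for the CRC routine under test: a plain bitwise
 * CRC-32 (reflected polynomial 0xEDB88320, initial value and final XOR
 * of 0xFFFFFFFF). A real test would call the library function instead.
 */
static uint32_t crc32_bitwise(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            /* Conditionally XOR the polynomial when the low bit is set. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/*
 * Assumed helper: average CPU time per call in nanoseconds.
 * The first byte of buf is overwritten on every iteration so that the
 * compiler cannot hoist the CRC call out of the loop; results are
 * accumulated into *sink so they are not discarded.
 */
static double bench_ns_per_call(uint8_t *buf, size_t len,
                                unsigned int iterations, uint32_t *sink)
{
    uint32_t acc = 0;
    clock_t start, end;

    assert(len > 0 && iterations > 0);

    start = clock();
    for (unsigned int i = 0; i < iterations; i++) {
        buf[0] = (uint8_t)i;
        acc ^= crc32_bitwise(buf, len);
    }
    end = clock();

    *sink = acc;
    return (double)(end - start) / CLOCKS_PER_SEC * 1e9 / iterations;
}
```

Calling bench_ns_per_call() before and after a change, for a few short
lengths (for example 16, 17, 31 and 64 bytes) and a large iteration
count, would give numbers that could go in the commit log. clock() is
coarse, so a cycle counter would be a better fit for differences of a
few cycles.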

I'll wait for a review from an x86 maintainer.

