On Mon, 2 Nov 2015 21:49:54 +0530
Jerin Jacob <jerin.jacob at caviumnetworks.com> wrote:

> On Mon, Nov 02, 2015 at 04:39:37PM +0100, Jan Viktorin wrote:
> > On Mon, 2 Nov 2015 19:48:40 +0530
> > Jerin Jacob <jerin.jacob at caviumnetworks.com> wrote:
> >   
> > > Signed-off-by: Jerin Jacob <jerin.jacob at caviumnetworks.com>
> > > ---
> > >  app/test-acl/main.c           |   4 +
> > >  lib/librte_acl/Makefile       |   5 +
> > >  lib/librte_acl/acl.h          |   4 +
> > >  lib/librte_acl/acl_run_neon.c |  46 +++++++
> > >  lib/librte_acl/acl_run_neon.h | 290 ++++++++++++++++++++++++++++++++++++++++++
> > >  lib/librte_acl/rte_acl.c      |  25 ++++
> > >  lib/librte_acl/rte_acl.h      |   1 +
> > >  7 files changed, 375 insertions(+)
> > >  create mode 100644 lib/librte_acl/acl_run_neon.c
> > >  create mode 100644 lib/librte_acl/acl_run_neon.h
> > > 
> > > diff --git a/app/test-acl/main.c b/app/test-acl/main.c
> > > index 72ce83c..0b0c093 100644
> > > --- a/app/test-acl/main.c
> > > +++ b/app/test-acl/main.c
> > > @@ -101,6 +101,10 @@ static const struct acl_alg acl_alg[] = {
> > >           .name = "avx2",
> > >           .alg = RTE_ACL_CLASSIFY_AVX2,
> > >   },
> > > + {
> > > +         .name = "neon",
> > > +         .alg = RTE_ACL_CLASSIFY_NEON,
> > > + },
> > >  };
> > >  
> > >  static struct {
> > > diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile
> > > index 7a1cf8a..27f91d5 100644
> > > --- a/lib/librte_acl/Makefile
> > > +++ b/lib/librte_acl/Makefile
> > > @@ -48,9 +48,14 @@ SRCS-$(CONFIG_RTE_LIBRTE_ACL) += rte_acl.c
> > >  SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_bld.c
> > >  SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c
> > >  SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_scalar.c
> > > +ifeq ($(CONFIG_RTE_ARCH_ARM64),y)
> > > +SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_neon.c  
> > 
> > Are the NEON intrinsics used for ACL ARMv8-specific? If so, the file should be
> > named something like acl_run_neonv8.c...
> >   
> 
> Yes, it is a bit ARMv8-specific: it looks like the vqtbl1q_u8 NEON intrinsic
> is defined only on ARMv8. I could rename the file to acl_run_neonv8.c, but if
> we keep it as acl_run_neon.c, it may be extended to ARMv7 in the future.
> I am open to either decision; let me know your views.

OK, this sounds reasonable. Leave it as it is.
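
If ARMv7 support is added to this file later, the ARMv8-only intrinsic could
be isolated behind a preprocessor check. A minimal sketch of the idea, not
taken from the patch: lookup16() is a hypothetical helper, and I am assuming
the compiler predefines __aarch64__ on ARMv8 (GCC and clang do). The scalar
branch mirrors vqtbl1q_u8, which returns 0 for indices outside the table.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__)
#include <arm_neon.h>
#endif

/*
 * Hypothetical helper (not part of the patch):
 * out[i] = idx[i] < 16 ? table[idx[i]] : 0
 */
static void
lookup16(const uint8_t table[16], const uint8_t idx[16], uint8_t out[16])
{
#if defined(__aarch64__)
	/* ARMv8: a single 16-byte table lookup. */
	uint8x16_t t = vld1q_u8(table);
	uint8x16_t i = vld1q_u8(idx);

	vst1q_u8(out, vqtbl1q_u8(t, i));
#else
	/* Fallback with the same semantics for other targets. */
	size_t k;

	for (k = 0; k < 16; k++)
		out[k] = idx[k] < 16 ? table[idx[k]] : 0;
#endif
}
```

An ARMv7 variant would then only need to replace the fallback branch.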

> 
> > > +else
> > >  SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
> > > +endif
> > >  
> > >  CFLAGS_acl_run_sse.o += -msse4.1
> > > +CFLAGS_acl_run_neon.o += -flax-vector-conversions -Wno-maybe-uninitialized
> > 
> > From man gcc:
> > 
> > -flax-vector-conversions
> >  Allow implicit conversions between vectors with differing numbers of elements
> >  and/or incompatible element types.  This option should not be used for new code.
> > 
> > I already pointed this out in Dave's ARMv8 patchset. They dropped it silently.
> > What is the purpose? Is it necessary?
> 
> Yes, the same tr hi value can be represented as unsigned or signed,
> depending on whether it is DFA or QRANGE.

I don't understand your answer. What is "tr hi"? What do DFA and
QRANGE mean here?

I just wanted to point out the note: "This option should not be used for
new code."
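
If the point is only to view the same bits as signed in one place and as
unsigned in another, an explicit cast should do it without the flag. A
minimal sketch using the generic GCC/clang vector extension; the type and
function names are made up for illustration, and I have not checked it
against acl_run_neon.h. With arm_neon.h, the equivalent explicit form would
be vreinterpretq_u32_s32().

```c
#include <stdint.h>

/* Hypothetical 128-bit vector types, four 32-bit lanes each. */
typedef int32_t  v4i32 __attribute__((vector_size(16)));
typedef uint32_t v4u32 __attribute__((vector_size(16)));

/*
 * An explicit cast between vector types of the same size reinterprets
 * the bits, so it compiles without -flax-vector-conversions.
 */
static v4u32
as_unsigned(v4i32 v)
{
	return (v4u32)v;
}

/* The same bits compared as signed values (lane 0 only). */
static int
lane0_less_signed(v4i32 a, v4i32 b)
{
	return a[0] < b[0];
}

/* The same bits compared as unsigned values (lane 0 only). */
static int
lane0_less_unsigned(v4i32 a, v4i32 b)
{
	return as_unsigned(a)[0] < as_unsigned(b)[0];
}
```

Here the conversion is visible at every place where the interpretation
changes, instead of being allowed implicitly for the whole file.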

Jan

> 
> 
> > 
> > Jan
> >   
> > >  
> > >  #
> > >  # If the compiler supports AVX2 instructions,
> > > diff --git a/lib/librte_acl/acl.h b/lib/librte_acl/acl.h
> > > index eb4930c..09d6784 100644
> > > --- a/lib/librte_acl/acl.h
> > > +++ b/lib/librte_acl/acl.h
> > > @@ -230,6 +230,10 @@ int
> > >  rte_acl_classify_avx2(const struct rte_acl_ctx *ctx, const uint8_t **data,
> > >   uint32_t *results, uint32_t num, uint32_t categories);
> > >    
> > --snip--
> > 
> > -- 
> >    Jan Viktorin                  E-mail: Viktorin at RehiveTech.com
> >    System Architect              Web:    www.RehiveTech.com
> >    RehiveTech
> >    Brno, Czech Republic  



-- 
   Jan Viktorin                  E-mail: Viktorin at RehiveTech.com
   System Architect              Web:    www.RehiveTech.com
   RehiveTech
   Brno, Czech Republic
