Re: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev subsystem

Jerin Jacob Kollanukkaran Fri, 27 Sep 2019 07:46:10 -0700

> -----Original Message-----
> From: Jerin Jacob Kollanukkaran
> Sent: Tuesday, September 10, 2019 4:33 PM
> To: Shahaf Shuler <[email protected]>; Thomas Monjalon
> <[email protected]>; [email protected]
> Cc: Pavan Nikhilesh Bhagavatula <[email protected]>; Hemant
> Agrawal <[email protected]>; Opher Reviv <[email protected]>;
> Alex Rosenbaum <[email protected]>; Dovrat Zifroni
> <[email protected]>; Prasun Kapoor <[email protected]>; Nipun Gupta
> <[email protected]>; Wang, Xiang W <[email protected]>;
> Richardson, Bruce <[email protected]>; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]
> Subject: RE: [dpdk-dev] [RFC PATCH v1] regexdev: introduce regexdev
> subsystem
> 
> > Hi Jerin,
> 
> Hi Shahaf,
> 
> Sorry for the delay in responding (I was busy with the 19.11 proposal deadline).
> Please see inline.
> 
> > >
> > > RegEx pattern matching applications:
> > > • Next Generation Firewalls (NGFW)
> > > • Deep Packet and Flow Inspection (DPI)
> > > • Intrusion Prevention Systems (IPS)
> > > • DDoS Mitigation
> > > • Network Monitoring
> > > • Data Loss Prevention (DLP)
> > > • Smart NICs
> > > • Grammar based content processing
> > > • URL, spam and adware filtering
> > > • Advanced auditing and policing of user/application security policies
> > > • Financial data mining - parsing of streamed financial feeds
> >
> > I think two more important use cases to add (at least in the doc of
> > this subsystem) are:
> > * application recognition
> > * memory introspection
> 
> Sure. Will add the following from John as well.
> 
> # Natural Language Processing (NLP)
> # Sentiment Analysis
> # Big Data database acceleration (Spark, Hadoop etc.)
> # Computational Storage
> 
> >
> >
> > > +/**
> > > + * Update the rule database of a RegEx device.
> > > + *
> > > + * @param dev_id RegEx device identifier
> > > + * @param rules
> > > + *   Points to an array of *nb_rules* objects of type *rte_regex_rule* structure
> > > + *   which contain the regex rules attributes to be updated in rule database.
> > > + * @param nb_rules
> > > + *   The number of PCRE rules to update the rule database.
> > > + *
> > > + * @return
> > > + *   The number of regex rules actually updated on the regex device's rule
> > > + *   database. The return value can be less than the value of the *nb_rules*
> > > + *   parameter when the regex devices fails to update the rule database or
> > > + *   if invalid parameters are specified in a *rte_regex_rule*.
> > > + *   If the return value is less than *nb_rules*, the remaining PCRE rules
> > > + *   at the end of *rules* are not consumed and the caller has to take
> > > + *   care of them and rte_errno is set accordingly.
> > > + *   Possible errno values include:
> > > + *   - -EINVAL:  Invalid device ID or rules is NULL
> > > + *   - -ENOTSUP: The last processed rule is not supported on this device.
> > > + *   - -ENOSPC: No space available in rule database.
> > > + *
> > > + * @see rte_regex_rule_db_import(), rte_regex_rule_db_export()
> > > + */
> > > +uint16_t rte_regex_rule_db_update(uint8_t dev_id,
> > > +                  const struct rte_regex_rule *rules,
> > > +                  uint16_t nb_rules);
> >
> > I think the function name is not very informative. If this function is
> > meant to compile the rules, then that should be explicit in the function name.
> 
> It is meant to compile the rules and then update the rule database.
> 
> I think we can have either 1 or 2. Let me know your preference, or if you have
> any other name suggestion, I will change it accordingly.
> 
> 1. rte_regex_rule_db_compile()
> 2. rte_regex_rule_db_compile_update()



@Shahaf Shuler, Thoughts?
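
To make the intended semantics concrete while we settle on the name, here is a rough, self-contained sketch of how a caller would deal with a partial update. Nothing below is the RFC API: sketch_rule, sketch_errno, sketch_rule_db_compile_update() and the fixed-capacity mock database are hypothetical stand-ins for struct rte_regex_rule, rte_errno and the function under discussion, and the use of positive errno values is my assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the RFC's struct rte_regex_rule. */
struct sketch_rule {
	uint32_t rule_id;
	const char *pcre; /* pattern text; NULL models an unsupported rule */
};

/* Hypothetical stand-in for rte_errno (positive values are an assumption). */
static int sketch_errno;

/* Free slots left in the mock rule database. */
static uint16_t sketch_db_free = 3;

/*
 * Mock of the "compile, then update the rule database" call.
 * Rules are consumed in order; processing stops at the first failure and
 * the number of rules consumed so far is returned.
 */
static uint16_t
sketch_rule_db_compile_update(uint8_t dev_id, const struct sketch_rule *rules,
			      uint16_t nb_rules)
{
	uint16_t i;

	sketch_errno = 0;
	if (dev_id != 0 || rules == NULL) {
		sketch_errno = EINVAL;
		return 0;
	}
	for (i = 0; i < nb_rules; i++) {
		if (rules[i].pcre == NULL) {
			sketch_errno = ENOTSUP; /* rule not supported */
			break;
		}
		if (sketch_db_free == 0) {
			sketch_errno = ENOSPC; /* rule database is full */
			break;
		}
		sketch_db_free--;
	}
	return i;
}

/*
 * Caller side: returns how many rules at the end of *rules* were left
 * unconsumed. The caller inspects sketch_errno to decide what to do next.
 */
static uint16_t
sketch_load_rules(uint8_t dev_id, const struct sketch_rule *rules,
		  uint16_t nb_rules)
{
	uint16_t done = sketch_rule_db_compile_update(dev_id, rules, nb_rules);

	assert(done <= nb_rules);
	return (uint16_t)(nb_rules - done);
}
```

With either name, the caller's contract stays the same: a return value below nb_rules means the tail of the array is still owned by the caller, and the error code says why the device stopped.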


> 
> 
> > > +
> > > + */
> > > +struct rte_regex_ops {
> > > +
> > > + /* W4 */
> > > + RTE_STD_C11
> > > + union {
> > > +         uint64_t user_id;
> > > +         /**< Application specific opaque value. An application may use
> > > +          * this field to hold application specific value to share
> > > +          * between dequeue and enqueue operation.
> > > +          * Implementation should not modify this field.
> > > +          */
> > > +         void *user_ptr;
> > > +         /**< Pointer representation of *user_id* */
> > > + };
> >
> > Since we target the regex subsystem for both regex and DPI, I think it
> > will be good to add another uint64_t field called connection_id.
> > Devices that support DPI can refer to it as another matchable field
> > when looking up matches in the given buffer.
> >
> > This field is different from the user_id, as it is not opaque to the device.
> 
> Is this a driver-specific storage place that the application should not touch?
> 
> If not, could you share the data flow of this field? I.e. who "writes" this
> field and who "reads" it?

@Shahaf Shuler Thoughts?

Based on your input, I will update the next version.

> 
> This is just for documentation. In any event, we can add new fields.
> 
> If it is only for driver usage, then I think some drivers may need more than 8B
> of storage. In that case, each driver can add its own fields after W4 (i.e. the
> existing user_id), and we introduce a new field called match_offset in struct
> rte_regex_ops,
> 
> i.e. struct rte_regex_match *matches == ops + ops->match_offset; so that each
> driver can add enough driver-specific metadata.
> 
> 
>
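
To illustrate the match_offset idea, here is a minimal sketch under my own assumptions: match_offset is read as a byte offset from the start of the op, and the struct layouts, the nb_matches field and the sketch_* names are all hypothetical, not taken from the RFC header. The point is only that the match array is located through the offset, so a driver can reserve as much private metadata as it needs between the common fields and the matches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical match record; the fields are illustrative only. */
struct sketch_match {
	uint32_t rule_id;
	uint32_t offset;
	uint16_t len;
};

/* Hypothetical common part of the op, ending at "W4" plus the new field. */
struct sketch_ops {
	union {
		uint64_t user_id;
		void *user_ptr;
	};
	uint16_t match_offset; /* bytes from the start of the op to matches[] */
	uint16_t nb_matches;
};

/* Locate the match array through the offset instead of a fixed member. */
static inline struct sketch_match *
sketch_ops_matches(struct sketch_ops *ops)
{
	return (struct sketch_match *)((uint8_t *)ops + ops->match_offset);
}

/*
 * Allocate an op with priv_size bytes of driver-private metadata placed
 * between the common fields and the match array.
 */
static struct sketch_ops *
sketch_ops_alloc(size_t priv_size, uint16_t max_matches)
{
	size_t align = _Alignof(struct sketch_match);
	size_t off = sizeof(struct sketch_ops) + priv_size;
	struct sketch_ops *ops;

	/* Round up so the match array is correctly aligned. */
	off = (off + align - 1) & ~(align - 1);
	assert(off <= UINT16_MAX);

	ops = calloc(1, off + (size_t)max_matches * sizeof(struct sketch_match));
	if (ops == NULL)
		return NULL;
	ops->match_offset = (uint16_t)off;
	return ops;
}
```

The application only ever calls the accessor, so drivers with different amounts of private metadata stay compatible with the same application code.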
