date:20160603

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Arnon Warshavsky

I

On Fri, Jun 3, 2016 at 10:23 PM, Wiles, Keith  wrote:

>
> On 6/3/16, 2:18 PM, "Neil Horman"  wrote:
>
> >On Fri, Jun 03, 2016 at 07:07:50PM +, Wiles, Keith wrote:
> >> On 6/3/16, 2:00 PM, "dev on behalf of Wiles, Keith" <
> dev-bounces at dpdk.org on behalf of keith.wiles at intel.com> wrote:
> >>
> >> >On 6/3/16, 1:52 PM, "Arnon Warshavsky"  arnon at qwilt.com>> wrote:
> >> >
> >> >
> >> >
> >> >On Fri, Jun 3, 2016 at 9:38 PM, Neil Horman  > wrote:
> >> >On Fri, Jun 03, 2016 at 06:29:13PM +, Wiles, Keith wrote:
> >> >>
> >> >> On 6/3/16, 12:44 PM, "Neil Horman"  nhorman at tuxdriver.com>> wrote:
> >> >>
> >> >> >On Fri, Jun 03, 2016 at 04:04:14PM +, Wiles, Keith wrote:
> >> >> >> Sorry, I deleted all of the text as it was getting a bit long.
> >> >> >>
> >> >> >> Here are my thoughts as of now, which is a combination of many
> suggestions I read from everyone?s emails. I hope this is not too hard to
> understand.
> >> >> >>
> >> >> >> - Break out the current command line options out of the DPDK
> common code and move into a new lib.
> >> >> >>   - At this point I was thinking of keeping the
> rte_eal_init(args, argv) API and just have it pass the args/argv to the new
> lib to create the data storage.
> >> >> >>  - Maybe move the rte_eal_init() API to the new lib or keep
> it in the common eal code. Do not want to go hog wild.
> >> >> >>   - The rte_eal_init(args, argv) would then call to the new API
> rte_eal_initialize(void), which in turn queries the data storage. (still
> thinking here)
> >> >> >These three items seem to be the exact opposite of my suggestion.
> The point of
> >> >> >this change was to segregate the parsing of configuration away from
> the
> >> >> >initalization dpdk using that configurtion.  By keeping
> rte_eal_init in such a
> >> >> >way that the command line is directly passed into it, you've not
> changed that
> >> >> >implicit binding to command line options.
> >> >>
> >> >> Neil,
> >> >>
> >> >> You maybe reading the above wrong or I wrote it wrong, which is a
> high possibility. I want to move the command line parsing out of DPDK an
> into a library, but I still believe I need to provide some backward
> compatibility for ABI and to reduce the learning curve. The current
> applications can still call the rte_eal_init(), which then calls the new
> lib parser for dpdk command line options and then calls
> rte_eal_initialize() or move to the new API rte_eal_initialize() preceded
> by a new library call to parse the old command line args. At some point we
> can deprecate the rte_eal_init() if we think it is reasonable.
> >> >>
> >> >> >
> >> >> >I can understand if you want to keep rte_eal_init as is for ABI
> purposes, but
> >> >> >then you should create an rte_eal_init2(foo), where foo is some
> handle to in
> >> >> >memory parsed configuration, so that applications can preform that
> separation.
> >> >>
> >> >> I think you describe what I had planned here. The
> rte_eal_initialize() routine is the new rte_eal_init2() API and the
> rte_eal_init() was only for backward compatibility was my thinking. I
> figured the argument to rte_eal_initialize() would be something to be
> decided, but it will mostly likely be some type of pointer to the storage.
> >> >>
> >> >> I hope that clears that up, but let me know.
> >> >>
> >> >yes, that clarifies your thinking, and I agree with it.  Thank you!
> >> >Neil
> >> >
> >> >> ++Keith
> >> >>
> >> >> >
> >> >> >Neil
> >> >> >
> >> >> >>   - The example apps args needs to be passed to the examples as
> is for now, then we can convert them one at a time if needed.
> >> >> >>
> >> >> >> - I would like to keep the storage of the data separate from the
> file parser as they can use the ?set? routines to build the data storage up.
> >> >> >>   - Keeping them split allows for new parsers to be created,
> while keeping the data storage from changing.
> >> >> >> - The rte_cfg code could be modified to use the new configuration
> if someone wants to take on that task ?
> >> >> >>
> >> >> >> - Next is the data storage and how we can access the data in a
> clean simple way.
> >> >> >> - I want to have some simple level of hierarchy in the data.
> >> >> >>   - Having a string containing at least two levels
> ?primary:secondary?.
> >> >> >>  - Primary string is something like ?EAL? or ?Pktgen? or
> ?testpmd? to divide the data storage into logical major groups.
> >> >> >> - The primary allows us to have groups and then we can
> have common secondary strings in different groups if needed.
> >> >> >>  - Secondary string can be whatever the developer of that
> group would like e.g. simple ?EAL:foobar?, two levels ?testpmd:foo.bar?
> >> >> >>
> >> >> >>   - The secondary string is treated as a single string if it has
> a hierarchy or not, but referencing a single value in the data storage.
> >> >> >>  - Key value pairs (KVP) or a hashmap data store.
> >> >> >> - The key here is the

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Arnon Warshavsky

On Fri, Jun 3, 2016 at 9:38 PM, Neil Horman  wrote:

> On Fri, Jun 03, 2016 at 06:29:13PM +, Wiles, Keith wrote:
> >
> > On 6/3/16, 12:44 PM, "Neil Horman"  wrote:
> >
> > >On Fri, Jun 03, 2016 at 04:04:14PM +, Wiles, Keith wrote:
> > >> Sorry, I deleted all of the text as it was getting a bit long.
> > >>
> > >> Here are my thoughts as of now, which is a combination of many
> suggestions I read from everyone?s emails. I hope this is not too hard to
> understand.
> > >>
> > >> - Break out the current command line options out of the DPDK common
> code and move into a new lib.
> > >>   - At this point I was thinking of keeping the rte_eal_init(args,
> argv) API and just have it pass the args/argv to the new lib to create the
> data storage.
> > >>  - Maybe move the rte_eal_init() API to the new lib or keep it in
> the common eal code. Do not want to go hog wild.
> > >>   - The rte_eal_init(args, argv) would then call to the new API
> rte_eal_initialize(void), which in turn queries the data storage. (still
> thinking here)
> > >These three items seem to be the exact opposite of my suggestion.  The
> point of
> > >this change was to segregate the parsing of configuration away from the
> > >initalization dpdk using that configurtion.  By keeping rte_eal_init in
> such a
> > >way that the command line is directly passed into it, you've not
> changed that
> > >implicit binding to command line options.
> >
> > Neil,
> >
> > You maybe reading the above wrong or I wrote it wrong, which is a high
> possibility. I want to move the command line parsing out of DPDK an into a
> library, but I still believe I need to provide some backward compatibility
> for ABI and to reduce the learning curve. The current applications can
> still call the rte_eal_init(), which then calls the new lib parser for dpdk
> command line options and then calls rte_eal_initialize() or move to the new
> API rte_eal_initialize() preceded by a new library call to parse the old
> command line args. At some point we can deprecate the rte_eal_init() if we
> think it is reasonable.
> >
> > >
> > >I can understand if you want to keep rte_eal_init as is for ABI
> purposes, but
> > >then you should create an rte_eal_init2(foo), where foo is some handle
> to in
> > >memory parsed configuration, so that applications can preform that
> separation.
> >
> > I think you describe what I had planned here. The rte_eal_initialize()
> routine is the new rte_eal_init2() API and the rte_eal_init() was only for
> backward compatibility was my thinking. I figured the argument to
> rte_eal_initialize() would be something to be decided, but it will mostly
> likely be some type of pointer to the storage.
> >
> > I hope that clears that up, but let me know.
> >
> yes, that clarifies your thinking, and I agree with it.  Thank you!
> Neil
>
> > ++Keith
> >
> > >
> > >Neil
> > >
> > >>   - The example apps args needs to be passed to the examples as is
> for now, then we can convert them one at a time if needed.
> > >>
> > >> - I would like to keep the storage of the data separate from the file
> parser as they can use the ?set? routines to build the data storage up.
> > >>   - Keeping them split allows for new parsers to be created, while
> keeping the data storage from changing.
> > >> - The rte_cfg code could be modified to use the new configuration if
> someone wants to take on that task ?
> > >>
> > >> - Next is the data storage and how we can access the data in a clean
> simple way.
> > >> - I want to have some simple level of hierarchy in the data.
> > >>   - Having a string containing at least two levels
> ?primary:secondary?.
> > >>  - Primary string is something like ?EAL? or ?Pktgen? or
> ?testpmd? to divide the data storage into logical major groups.
> > >> - The primary allows us to have groups and then we can have
> common secondary strings in different groups if needed.
> > >>  - Secondary string can be whatever the developer of that group
> would like e.g. simple ?EAL:foobar?, two levels ?testpmd:foo.bar?
> > >>
> > >>   - The secondary string is treated as a single string if it has a
> hierarchy or not, but referencing a single value in the data storage.
> > >>  - Key value pairs (KVP) or a hashmap data store.
> > >> - The key here is the whole string ?EAL:foobar? not just
> ?foobar? secondary string.
> > >>- If we want to have the two split I am ok with that as
> well meaning the API would be:
> > >>  rte_map_get(mapObj, ?EAL?, ?foo.bar?);
> > >>  rte_map_set(mapObj, ?EAL?, ?foo.bar?, value);
> > >>- Have the primary as a different section in the data
> store, would allow for dumping that section maybe easier, not sure.
> > >>   - I am leaning toward
> > >>  - Not going to try splitting up the string or parse it as it is
> up to the developer to make it unique in the data store.
> > >> - Use a code design to make the strings simple to use without having
>

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Wiles, Keith


On 6/3/16, 2:18 PM, "Neil Horman"  wrote:

>On Fri, Jun 03, 2016 at 07:07:50PM +, Wiles, Keith wrote:
>> On 6/3/16, 2:00 PM, "dev on behalf of Wiles, Keith" > on behalf of keith.wiles at intel.com> wrote:
>> 
>> >On 6/3/16, 1:52 PM, "Arnon Warshavsky" mailto:arnon at 
>> >qwilt.com>> wrote:
>> >
>> >
>> >
>> >On Fri, Jun 3, 2016 at 9:38 PM, Neil Horman > >tuxdriver.com> wrote:
>> >On Fri, Jun 03, 2016 at 06:29:13PM +, Wiles, Keith wrote:
>> >>
>> >> On 6/3/16, 12:44 PM, "Neil Horman" > >> tuxdriver.com> wrote:
>> >>
>> >> >On Fri, Jun 03, 2016 at 04:04:14PM +, Wiles, Keith wrote:
>> >> >> Sorry, I deleted all of the text as it was getting a bit long.
>> >> >>
>> >> >> Here are my thoughts as of now, which is a combination of many 
>> >> >> suggestions I read from everyone?s emails. I hope this is not too hard 
>> >> >> to understand.
>> >> >>
>> >> >> - Break out the current command line options out of the DPDK common 
>> >> >> code and move into a new lib.
>> >> >>   - At this point I was thinking of keeping the rte_eal_init(args, 
>> >> >> argv) API and just have it pass the args/argv to the new lib to create 
>> >> >> the data storage.
>> >> >>  - Maybe move the rte_eal_init() API to the new lib or keep it in 
>> >> >> the common eal code. Do not want to go hog wild.
>> >> >>   - The rte_eal_init(args, argv) would then call to the new API 
>> >> >> rte_eal_initialize(void), which in turn queries the data storage. 
>> >> >> (still thinking here)
>> >> >These three items seem to be the exact opposite of my suggestion.  The 
>> >> >point of
>> >> >this change was to segregate the parsing of configuration away from the
>> >> >initalization dpdk using that configurtion.  By keeping rte_eal_init in 
>> >> >such a
>> >> >way that the command line is directly passed into it, you've not changed 
>> >> >that
>> >> >implicit binding to command line options.
>> >>
>> >> Neil,
>> >>
>> >> You maybe reading the above wrong or I wrote it wrong, which is a high 
>> >> possibility. I want to move the command line parsing out of DPDK an into 
>> >> a library, but I still believe I need to provide some backward 
>> >> compatibility for ABI and to reduce the learning curve. The current 
>> >> applications can still call the rte_eal_init(), which then calls the new 
>> >> lib parser for dpdk command line options and then calls 
>> >> rte_eal_initialize() or move to the new API rte_eal_initialize() preceded 
>> >> by a new library call to parse the old command line args. At some point 
>> >> we can deprecate the rte_eal_init() if we think it is reasonable.
>> >>
>> >> >
>> >> >I can understand if you want to keep rte_eal_init as is for ABI 
>> >> >purposes, but
>> >> >then you should create an rte_eal_init2(foo), where foo is some handle 
>> >> >to in
>> >> >memory parsed configuration, so that applications can preform that 
>> >> >separation.
>> >>
>> >> I think you describe what I had planned here. The rte_eal_initialize() 
>> >> routine is the new rte_eal_init2() API and the rte_eal_init() was only 
>> >> for backward compatibility was my thinking. I figured the argument to 
>> >> rte_eal_initialize() would be something to be decided, but it will mostly 
>> >> likely be some type of pointer to the storage.
>> >>
>> >> I hope that clears that up, but let me know.
>> >>
>> >yes, that clarifies your thinking, and I agree with it.  Thank you!
>> >Neil
>> >
>> >> ++Keith
>> >>
>> >> >
>> >> >Neil
>> >> >
>> >> >>   - The example apps args needs to be passed to the examples as is for 
>> >> >> now, then we can convert them one at a time if needed.
>> >> >>
>> >> >> - I would like to keep the storage of the data separate from the file 
>> >> >> parser as they can use the ?set? routines to build the data storage up.
>> >> >>   - Keeping them split allows for new parsers to be created, while 
>> >> >> keeping the data storage from changing.
>> >> >> - The rte_cfg code could be modified to use the new configuration if 
>> >> >> someone wants to take on that task ?
>> >> >>
>> >> >> - Next is the data storage and how we can access the data in a clean 
>> >> >> simple way.
>> >> >> - I want to have some simple level of hierarchy in the data.
>> >> >>   - Having a string containing at least two levels ?primary:secondary?.
>> >> >>  - Primary string is something like ?EAL? or ?Pktgen? or ?testpmd? 
>> >> >> to divide the data storage into logical major groups.
>> >> >> - The primary allows us to have groups and then we can have 
>> >> >> common secondary strings in different groups if needed.
>> >> >>  - Secondary string can be whatever the developer of that group 
>> >> >> would like e.g. simple ?EAL:foobar?, two levels ?testpmd:foo.bar?
>> >> >>
>> >> >>   - The secondary string is treated as a single string if it has a 
>> >> >> hierarchy or not, but referencing a single value in the data storage.
>> >> >>  - Key value pairs (KVP)

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Wiles, Keith

On 6/3/16, 2:00 PM, "dev on behalf of Wiles, Keith"  wrote:

>On 6/3/16, 1:52 PM, "Arnon Warshavsky" mailto:arnon at 
>qwilt.com>> wrote:
>
>
>
>On Fri, Jun 3, 2016 at 9:38 PM, Neil Horman tuxdriver.com> wrote:
>On Fri, Jun 03, 2016 at 06:29:13PM +, Wiles, Keith wrote:
>>
>> On 6/3/16, 12:44 PM, "Neil Horman" mailto:nhorman 
>> at tuxdriver.com>> wrote:
>>
>> >On Fri, Jun 03, 2016 at 04:04:14PM +, Wiles, Keith wrote:
>> >> Sorry, I deleted all of the text as it was getting a bit long.
>> >>
>> >> Here are my thoughts as of now, which is a combination of many 
>> >> suggestions I read from everyone?s emails. I hope this is not too hard to 
>> >> understand.
>> >>
>> >> - Break out the current command line options out of the DPDK common code 
>> >> and move into a new lib.
>> >>   - At this point I was thinking of keeping the rte_eal_init(args, argv) 
>> >> API and just have it pass the args/argv to the new lib to create the data 
>> >> storage.
>> >>  - Maybe move the rte_eal_init() API to the new lib or keep it in the 
>> >> common eal code. Do not want to go hog wild.
>> >>   - The rte_eal_init(args, argv) would then call to the new API 
>> >> rte_eal_initialize(void), which in turn queries the data storage. (still 
>> >> thinking here)
>> >These three items seem to be the exact opposite of my suggestion.  The 
>> >point of
>> >this change was to segregate the parsing of configuration away from the
>> >initalization dpdk using that configurtion.  By keeping rte_eal_init in 
>> >such a
>> >way that the command line is directly passed into it, you've not changed 
>> >that
>> >implicit binding to command line options.
>>
>> Neil,
>>
>> You maybe reading the above wrong or I wrote it wrong, which is a high 
>> possibility. I want to move the command line parsing out of DPDK an into a 
>> library, but I still believe I need to provide some backward compatibility 
>> for ABI and to reduce the learning curve. The current applications can still 
>> call the rte_eal_init(), which then calls the new lib parser for dpdk 
>> command line options and then calls rte_eal_initialize() or move to the new 
>> API rte_eal_initialize() preceded by a new library call to parse the old 
>> command line args. At some point we can deprecate the rte_eal_init() if we 
>> think it is reasonable.
>>
>> >
>> >I can understand if you want to keep rte_eal_init as is for ABI purposes, 
>> >but
>> >then you should create an rte_eal_init2(foo), where foo is some handle to in
>> >memory parsed configuration, so that applications can preform that 
>> >separation.
>>
>> I think you describe what I had planned here. The rte_eal_initialize() 
>> routine is the new rte_eal_init2() API and the rte_eal_init() was only for 
>> backward compatibility was my thinking. I figured the argument to 
>> rte_eal_initialize() would be something to be decided, but it will mostly 
>> likely be some type of pointer to the storage.
>>
>> I hope that clears that up, but let me know.
>>
>yes, that clarifies your thinking, and I agree with it.  Thank you!
>Neil
>
>> ++Keith
>>
>> >
>> >Neil
>> >
>> >>   - The example apps args needs to be passed to the examples as is for 
>> >> now, then we can convert them one at a time if needed.
>> >>
>> >> - I would like to keep the storage of the data separate from the file 
>> >> parser as they can use the ?set? routines to build the data storage up.
>> >>   - Keeping them split allows for new parsers to be created, while 
>> >> keeping the data storage from changing.
>> >> - The rte_cfg code could be modified to use the new configuration if 
>> >> someone wants to take on that task ?
>> >>
>> >> - Next is the data storage and how we can access the data in a clean 
>> >> simple way.
>> >> - I want to have some simple level of hierarchy in the data.
>> >>   - Having a string containing at least two levels ?primary:secondary?.
>> >>  - Primary string is something like ?EAL? or ?Pktgen? or ?testpmd? to 
>> >> divide the data storage into logical major groups.
>> >> - The primary allows us to have groups and then we can have 
>> >> common secondary strings in different groups if needed.
>> >>  - Secondary string can be whatever the developer of that group would 
>> >> like e.g. simple ?EAL:foobar?, two levels ?testpmd:foo.bar?
>> >>
>> >>   - The secondary string is treated as a single string if it has a 
>> >> hierarchy or not, but referencing a single value in the data storage.
>> >>  - Key value pairs (KVP) or a hashmap data store.
>> >> - The key here is the whole string ?EAL:foobar? not just ?foobar? 
>> >> secondary string.
>> >>- If we want to have the two split I am ok with that as well 
>> >> meaning the API would be:
>> >>  rte_map_get(mapObj, ?EAL?, ?foo.bar?);
>> >>  rte_map_set(mapObj, ?EAL?, ?foo.bar?, value);
>> >>- Have the primary as a different section in the data store, 
>> >> would allow

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Wiles, Keith

On 6/3/16, 1:52 PM, "Arnon Warshavsky" mailto:arnon at 
qwilt.com>> wrote:



On Fri, Jun 3, 2016 at 9:38 PM, Neil Horman mailto:nhorman at tuxdriver.com>> wrote:
On Fri, Jun 03, 2016 at 06:29:13PM +, Wiles, Keith wrote:
>
> On 6/3/16, 12:44 PM, "Neil Horman" mailto:nhorman 
> at tuxdriver.com>> wrote:
>
> >On Fri, Jun 03, 2016 at 04:04:14PM +, Wiles, Keith wrote:
> >> Sorry, I deleted all of the text as it was getting a bit long.
> >>
> >> Here are my thoughts as of now, which is a combination of many suggestions 
> >> I read from everyone?s emails. I hope this is not too hard to understand.
> >>
> >> - Break out the current command line options out of the DPDK common code 
> >> and move into a new lib.
> >>   - At this point I was thinking of keeping the rte_eal_init(args, argv) 
> >> API and just have it pass the args/argv to the new lib to create the data 
> >> storage.
> >>  - Maybe move the rte_eal_init() API to the new lib or keep it in the 
> >> common eal code. Do not want to go hog wild.
> >>   - The rte_eal_init(args, argv) would then call to the new API 
> >> rte_eal_initialize(void), which in turn queries the data storage. (still 
> >> thinking here)
> >These three items seem to be the exact opposite of my suggestion.  The point 
> >of
> >this change was to segregate the parsing of configuration away from the
> >initalization dpdk using that configurtion.  By keeping rte_eal_init in such 
> >a
> >way that the command line is directly passed into it, you've not changed that
> >implicit binding to command line options.
>
> Neil,
>
> You maybe reading the above wrong or I wrote it wrong, which is a high 
> possibility. I want to move the command line parsing out of DPDK an into a 
> library, but I still believe I need to provide some backward compatibility 
> for ABI and to reduce the learning curve. The current applications can still 
> call the rte_eal_init(), which then calls the new lib parser for dpdk command 
> line options and then calls rte_eal_initialize() or move to the new API 
> rte_eal_initialize() preceded by a new library call to parse the old command 
> line args. At some point we can deprecate the rte_eal_init() if we think it 
> is reasonable.
>
> >
> >I can understand if you want to keep rte_eal_init as is for ABI purposes, but
> >then you should create an rte_eal_init2(foo), where foo is some handle to in
> >memory parsed configuration, so that applications can preform that 
> >separation.
>
> I think you describe what I had planned here. The rte_eal_initialize() 
> routine is the new rte_eal_init2() API and the rte_eal_init() was only for 
> backward compatibility was my thinking. I figured the argument to 
> rte_eal_initialize() would be something to be decided, but it will mostly 
> likely be some type of pointer to the storage.
>
> I hope that clears that up, but let me know.
>
yes, that clarifies your thinking, and I agree with it.  Thank you!
Neil

> ++Keith
>
> >
> >Neil
> >
> >>   - The example apps args needs to be passed to the examples as is for 
> >> now, then we can convert them one at a time if needed.
> >>
> >> - I would like to keep the storage of the data separate from the file 
> >> parser as they can use the ?set? routines to build the data storage up.
> >>   - Keeping them split allows for new parsers to be created, while keeping 
> >> the data storage from changing.
> >> - The rte_cfg code could be modified to use the new configuration if 
> >> someone wants to take on that task ?
> >>
> >> - Next is the data storage and how we can access the data in a clean 
> >> simple way.
> >> - I want to have some simple level of hierarchy in the data.
> >>   - Having a string containing at least two levels ?primary:secondary?.
> >>  - Primary string is something like ?EAL? or ?Pktgen? or ?testpmd? to 
> >> divide the data storage into logical major groups.
> >> - The primary allows us to have groups and then we can have common 
> >> secondary strings in different groups if needed.
> >>  - Secondary string can be whatever the developer of that group would 
> >> like e.g. simple ?EAL:foobar?, two levels ?testpmd:foo.bar?
> >>
> >>   - The secondary string is treated as a single string if it has a 
> >> hierarchy or not, but referencing a single value in the data storage.
> >>  - Key value pairs (KVP) or a hashmap data store.
> >> - The key here is the whole string ?EAL:foobar? not just ?foobar? 
> >> secondary string.
> >>- If we want to have the two split I am ok with that as well 
> >> meaning the API would be:
> >>  rte_map_get(mapObj, ?EAL?, ?foo.bar?);
> >>  rte_map_set(mapObj, ?EAL?, ?foo.bar?, value);
> >>- Have the primary as a different section in the data store, 
> >> would allow for dumping that section maybe easier, not sure.
> >>   - I am leaning toward
> >>  - Not going to try splitting up the string or parse it as it is up to 
> >> the developer to

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Wiles, Keith


On 6/3/16, 12:44 PM, "Neil Horman"  wrote:

>On Fri, Jun 03, 2016 at 04:04:14PM +, Wiles, Keith wrote:
>> Sorry, I deleted all of the text as it was getting a bit long.
>> 
>> Here are my thoughts as of now, which is a combination of many suggestions I 
>> read from everyone?s emails. I hope this is not too hard to understand.
>> 
>> - Break out the current command line options out of the DPDK common code and 
>> move into a new lib.
>>   - At this point I was thinking of keeping the rte_eal_init(args, argv) API 
>> and just have it pass the args/argv to the new lib to create the data 
>> storage.
>>  - Maybe move the rte_eal_init() API to the new lib or keep it in the 
>> common eal code. Do not want to go hog wild.
>>   - The rte_eal_init(args, argv) would then call to the new API 
>> rte_eal_initialize(void), which in turn queries the data storage. (still 
>> thinking here)
>These three items seem to be the exact opposite of my suggestion.  The point of
>this change was to segregate the parsing of configuration away from the
>initalization dpdk using that configurtion.  By keeping rte_eal_init in such a
>way that the command line is directly passed into it, you've not changed that
>implicit binding to command line options.

Neil,

You maybe reading the above wrong or I wrote it wrong, which is a high 
possibility. I want to move the command line parsing out of DPDK an into a 
library, but I still believe I need to provide some backward compatibility for 
ABI and to reduce the learning curve. The current applications can still call 
the rte_eal_init(), which then calls the new lib parser for dpdk command line 
options and then calls rte_eal_initialize() or move to the new API 
rte_eal_initialize() preceded by a new library call to parse the old command 
line args. At some point we can deprecate the rte_eal_init() if we think it is 
reasonable.

>
>I can understand if you want to keep rte_eal_init as is for ABI purposes, but
>then you should create an rte_eal_init2(foo), where foo is some handle to in
>memory parsed configuration, so that applications can preform that separation.

I think you describe what I had planned here. The rte_eal_initialize() routine 
is the new rte_eal_init2() API and the rte_eal_init() was only for backward 
compatibility was my thinking. I figured the argument to rte_eal_initialize() 
would be something to be decided, but it will mostly likely be some type of 
pointer to the storage.

I hope that clears that up, but let me know.

++Keith

>
>Neil
>
>>   - The example apps args needs to be passed to the examples as is for now, 
>> then we can convert them one at a time if needed.
>> 
>> - I would like to keep the storage of the data separate from the file parser 
>> as they can use the ?set? routines to build the data storage up.
>>   - Keeping them split allows for new parsers to be created, while keeping 
>> the data storage from changing.
>> - The rte_cfg code could be modified to use the new configuration if someone 
>> wants to take on that task ?
>> 
>> - Next is the data storage and how we can access the data in a clean simple 
>> way.
>> - I want to have some simple level of hierarchy in the data.
>>   - Having a string containing at least two levels ?primary:secondary?.
>>  - Primary string is something like ?EAL? or ?Pktgen? or ?testpmd? to 
>> divide the data storage into logical major groups.
>> - The primary allows us to have groups and then we can have common 
>> secondary strings in different groups if needed.
>>  - Secondary string can be whatever the developer of that group would 
>> like e.g. simple ?EAL:foobar?, two levels ?testpmd:foo.bar?
>> 
>>   - The secondary string is treated as a single string if it has a hierarchy 
>> or not, but referencing a single value in the data storage.
>>  - Key value pairs (KVP) or a hashmap data store.
>> - The key here is the whole string ?EAL:foobar? not just ?foobar? 
>> secondary string.
>>- If we want to have the two split I am ok with that as well 
>> meaning the API would be:
>>  rte_map_get(mapObj, ?EAL?, ?foo.bar?);
>>  rte_map_set(mapObj, ?EAL?, ?foo.bar?, value);
>>- Have the primary as a different section in the data store, 
>> would allow for dumping that section maybe easier, not sure.
>>   - I am leaning toward
>>  - Not going to try splitting up the string or parse it as it is up to 
>> the developer to make it unique in the data store.
>> - Use a code design to make the strings simple to use without having typos 
>> be a problem.
>>- Not sure what the design is yet, but I do not want to have to concat 
>> two string or split strings in the code.
>> 
>> This is as far as I have gotten and got tired of typing ?
>> 
>> I hope this will satisfy most everyone?s needs for now.
>> 
>> 
>> Regards,
>> Keith
>> 
>> 
>> 
>

[dpdk-dev] RFC: DPDK Long Term Support

2016-06-03 Thread Thomas Monjalon

Hi,

2016-06-03 15:07, Mcnamara, John:
> Introduction
> 
> 
> This document sets out a proposal for a DPDK Long Term Support release (LTS).

In general, LTS refer to a longer maintenance than than regular one.
Here we are talking to doing some maintenance as stable releases first.
Currently we have no maintenance at all.
So I suggest to differentiate "stable branches" and "LTS" for some stable 
branches.

> The purpose of the DPDK LTS will be to maintain a stable release of DPDK with
> backported bug fixes over an extended period of time. This will provide
> downstream consumers of DPDK with a stable target on which to base
> applications or packages.
[...]
> The proposed maintainer for the LTS is Yuanhan Liu
> .

I wonder if Yuanhan is OK to maintain every stable releases which could be
requested/needed? Or should we have other committers for the stable releases
that Yuanhan would not want to maintain himself?
The Linux model is to let people declare themselves when they want to maintain
a stable branch.

> The proposed duration of the LTS support is 2 years.

I think we should discuss the support duration for each release separately.

> There will only be one LTS branch being maintained at any time. At the end of
> the 2 year cycle the maintenance on the previous LTS will be wound down.

Seems a bit too restrictive.
Currently, there is no maintenance at all because nobody was volunteer.
If Yuanhan is volunteer for a stable branch every 2 years, fine.
If someone else is volunteer for other branches, why not let him do it?

> The proposed initial LTS version will be DPDK 16.07. The next versions, based
> on a 2 year cycle, will be DPDK 18.08, 20.08, etc.

Let's do a first run with 16.07 and see later what we want to do next.
How long time a stable branch must be announced before its initial release?

> What changes should be backported
> -
> 
> * Bug fixes that don't break the ABI.

And API?
And behaviour (if not clearly documented in the API)?

[...]
> Developers submitting fixes to the mainline should also CC the maintainer so
> that they can evaluate the patch. A  email address could 
> be
> provided for this so that it can be included as a CC in the commit messages
> and documented in the Code Contribution Guidelines.

Why?
We must avoid putting too much restrictions on the contributors.

> Intel will provide validation engineers to test the LTS branch/tree. Tested
> releases can be marked using a Git tag with an incremented revision number. 
> For
> example: 16.07.00_LTS -> 16.07.01_LTS. The testing cadence should be quarterly
> but will be best effort only and dependent on available resources.

Thanks
It must not be just a tag. There should be an announce and a tarball ready
to download.

[...]
> In order to reduce the testing effort the number of OSes which will be
> officially validated should be as small as possible. The proposal is that the
> following long term OSes are used for validation:
> 
> (OSV reps please confirm.)
> 
> * Ubuntu 16.04 LTS
> * RHEL 7.3
> * SuSE 11 SP4 or 12
> * FreeBSD 10.3

I'm sure there will be more validation on the field or from contributors.

[...]
> The LTS guidelines shall be reviewed after 1 year to adjust for any 
> experiences
> from LTS maintainership.

Yes seems very reasonnable.
Thanks

[dpdk-dev] [PATCH v2] ivshmem: add all memzones of mempool to metadata

2016-06-03 Thread Ferruh Yigit

Mempool consist of multiple memzones, at least from two of them.
ivshmem assumes mempool and elements are all in same memzone.

Updating code to add all memzones when a mempool added.

Fixes: d1d914ebbc25 ("mempool: allocate in several memory chunks by
default")

Signed-off-by: Ferruh Yigit 
Acked-by: Anatoly  Burakov 
Acked-by: Olivier Matz 

---

v2:
* fix patch title metada -> metadata
---
 lib/librte_ivshmem/rte_ivshmem.c | 30 ++
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/lib/librte_ivshmem/rte_ivshmem.c b/lib/librte_ivshmem/rte_ivshmem.c
index c8b332c..5c83920 100644
--- a/lib/librte_ivshmem/rte_ivshmem.c
+++ b/lib/librte_ivshmem/rte_ivshmem.c
@@ -548,25 +548,39 @@ add_ring_to_metadata(const struct rte_ring * r,
 }

 static int
-add_mempool_to_metadata(const struct rte_mempool * mp,
-   struct ivshmem_config * config)
+add_mempool_memzone_to_metadata(const void *addr,
+   struct ivshmem_config *config)
 {
-   struct rte_memzone * mz;
-   int ret;
+   struct rte_memzone *mz;

-   mz = get_memzone_by_addr(mp);
-   ret = 0;
+   mz = get_memzone_by_addr(addr);

if (!mz) {
RTE_LOG(ERR, EAL, "Cannot find memzone for mempool!\n");
return -1;
}

-   /* mempool consists of memzone and ring */
-   ret = add_memzone_to_metadata(mz, config);
+   return add_memzone_to_metadata(mz, config);
+}
+
+static int
+add_mempool_to_metadata(const struct rte_mempool *mp,
+   struct ivshmem_config *config)
+{
+   struct rte_mempool_memhdr *memhdr;
+   int ret;
+
+   ret = add_mempool_memzone_to_metadata(mp, config);
if (ret < 0)
return -1;

+   STAILQ_FOREACH(memhdr, >mem_list, next) {
+   ret = add_mempool_memzone_to_metadata(memhdr->addr, config);
+   if (ret < 0)
+   return -1;
+   }
+
+   /* mempool consists of memzone and ring */
return add_ring_to_metadata(mp->ring, config);
 }

-- 
2.5.5

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Arnon Warshavsky

On Fri, Jun 3, 2016 at 3:53 PM, Panu Matilainen  wrote:

> On 06/03/2016 03:01 PM, Arnon Warshavsky wrote:
>
>>
>>
>> On Fri, Jun 3, 2016 at 2:50 PM, Neil Horman > > wrote:
>>
>> On Fri, Jun 03, 2016 at 12:01:30PM +0100, Bruce Richardson wrote:
>> > On Fri, Jun 03, 2016 at 11:29:43AM +0100, Bruce Richardson wrote:
>> > > On Thu, Jun 02, 2016 at 04:08:37PM -0400, Neil Horman wrote:
>> > > > On Thu, Jun 02, 2016 at 07:41:10PM +, Wiles, Keith wrote:
>> > > > >
>> > > > > On 6/2/16, 12:11 PM, "Neil Horman" > > wrote:
>> > > > >
>> > > > > >
>> > > > > >1) The definition of a config structure that can be passed
>> to rte_eal_init,
>> > > > > >defining the configuration for that running process
>> > > > >
>> > > > > Having a configuration structure means we have to have an
>> ABI change to that structure anytime we add or remove an option. I
>> was thinking a very simple DB of some kind would be better. Have the
>> code query the DB to obtain the needed information. The APIs used to
>> query and set the DB needs to be very easy to use as well.
>> > > >
>> > > > Thats a fair point.  A decent starting point is likely a
>> simple struct that
>> > > > looks like this:
>> > > >
>> > > > struct key_vals {
>> > > >   char *key;
>> > > >   union {
>> > > >   ulong longval;
>> > > >   void *ptrval;
>> > > >   } value;
>> > > > };
>> > > >
>> > > > struct config {
>> > > >   size_t count;
>> > > >   struct key_vals kvp[0];
>> > > > };
>> > > >
>> > > > >
>> > > > > Maybe each option can define its own structure if needed or
>> just a simple variable type can be used for the basic types (int,
>> string, bool, ?)
>> > > > >
>> > > > Well, if you have config sections that require mulitiple
>> elements, I'd handle
>> > > > that with naming, i.e. if you have a config group that has an
>> int and char
>> > > > value, I'd name them "group.intval", and "group.charval", so
>> they are
>> > > > independently searchable, but linked from a nomenclature
>> standpoint.
>> > > >
>> > > > > Would this work better in the long run, does a fixed
>> structure still make sense?
>> > > > >
>> > > > No. I think you're ABI concerns are valid, but the above is
>> likely a good
>> > > > starting point to address them.
>> > > >
>> > > > Best
>> > > > Neil
>> > >
>> > > I'll throw out one implementation idea here that I looked at
>> previously, for
>> > > the reason that it was simple enough implement with existing code.
>> > >
>> > > We already have the cfgfile library which works with name/value
>> pairs read from
>> > > ini files on disk. However, it would be easy enough to add
>> couple of APIs to
>> > > that to allow the user to "set" values inside an ini structure
>> as well. With
>> > > that done we can then just add a new eal_init api which takes a
>> single
>> > > "struct rte_cfgfile *" as parameter. For those apps that want to
>> just use
>> > > inifiles for configuration straight, they can then do:
>> > >
>> > > cfg = rte_cfgfile_load("my_cfg_file");
>> > > rte_eal_newinit(cfg);
>> > >
>> > > Those who want a different config can instead do:
>> > >
>> > > cfg = rte_cfgfile_new();
>> > > rte_cfgfile_add_section(cfg, "dpdk");
>> > > foreach_eal_setting_wanted:
>> > > rte_cfgfile_set(cfg, "dpdk", mysetting, myvalue);
>> > > rte_eal_newinit(cfg);
>> > >
>> > From chatting to a couple of other DPDK dev's here I suspect I may
>> not have
>> > been entirely clear here with this example. What is being shown
>> above is building
>> > up a "config-file" in memory - or rather a config structure which
>> happens to
>> > have the idea of sections and values as an ini file has. There is
>> no actual
>> > file ever being written to disk, and for those using any non-ini
>> config file
>> > structure for their app, the code overhead of using the APIs above
>> should be
>> > pretty much the same as building up any other set of key-value
>> pairs in
>> > memory to pass to an init function.
>> >
>> > Hope this is a little clearer now.
>> >
>> I'm fine with the idea of reusing the config file library that
>> currently exists,
>> or more to the point, modifying it to be usable as a configuration
>> API, rather
>> than a configuration file parser.  My primary interest is in
>> separating the user
>> configuration mechanism from the internal library configuration lookup
>> mechanism.  What I would really like to be able to see is
>> application developers
>> have the flexibiilty to choose their own configuration method

[dpdk-dev] [PATCH] ivshmem: add all memzones of mempool to metada

2016-06-03 Thread Ferruh Yigit

On 6/3/2016 12:05 PM, Ferruh Yigit wrote:
> On 6/2/2016 8:04 AM, Olivier MATZ wrote:
>> Hi Ferruh,
>>
>> Thank you for fixing this issue.
>>
>> On 06/01/2016 03:18 PM, Ferruh Yigit wrote:
>>> [PATCH] ivshmem: add all memzones of mempool to metada
>>
>> Minor comment: it seems the title is truncated
>>
> 
> Right, I will fix in next version of patch.
> 
>>> +static int
>>> +add_mempool_to_metadata(const struct rte_mempool *mp,
>>> +   struct ivshmem_config *config)
>>> +{
>>> +   struct rte_mempool_memhdr *memhdr;
>>> +   int ret;
>>> +
>>> +   ret = add_mempool_memzone_to_metadata(mp, config);
>>> if (ret < 0)
>>> return -1;
>>>  
>>> +   STAILQ_FOREACH(memhdr, >mem_list, next) {
>>> +   ret = add_mempool_memzone_to_metadata(memhdr->addr, config);
>>> +   if (ret < 0)
>>> +   return -1;
>>> +   }
>>> +
>>> +   /* mempool consists of memzone and ring */
>>> return add_ring_to_metadata(mp->ring, config);
>>>  }
>>>  
>>
>> In case you missed it: there is a function
>> rte_mempool_mem_iter() that can be used to browse the
>> memory chunks of a mempool. It's probably less convenient
>> to use compared to directly browsing the list, but it
>> may be more resistant to api changes.
> 
> I wasn't aware rte_mempool_mem_iter(), I will update the patch to use this.
> 

Although I prefer using the rte_mempool_mem_iter(); it is not
straightforward because of the const qualifier of mp, and
rte_mempool_mem_iter() doesn't accepts the const type, and casting const
to non-const causes a compile warning:
_"error: cast discards ?const? qualifier from pointer target type"_

An option is duplicating mp, (or part of it mp->mem_list) but that is
ugly and can cause potentials defects in the future, another option is
to use uint64_t to do pointer conversion, even uglier :), so I don't
want to use this API for this case.

So, I will only update the patch title and send a v2.

>>
>> Apart from that:
>> Acked-by: Olivier Matz 
>>
>> Thanks
>>
> 
> Thanks,
> ferruh
>

[dpdk-dev] [PATCH v7 1/5] mempool: support external mempool operations

2016-06-03 Thread Jerin Jacob

On Fri, Jun 03, 2016 at 11:28:14AM +0100, Hunt, David wrote:
> > 
> > > static inline struct rte_mempool_ops *
> > > rte_mempool_ops_get(int ops_index)
> > > 
> > > return _mempool_ops_table.ops[ops_index];
> > |> 2) Considering "get" and "put" are the fast-path callbacks for
> > |> pool-manger, Is it possible to avoid the extra overhead of the
> > |> following
> > |> _load_ and additional cache line on each call,
> > |> rte_mempool_handler_table.handler[handler_idx]
> > |>
> > |> I understand it is for multiprocess support but I am thing can we
> > |> introduce something like ethernet API support for multiprocess and
> > |> resolve "put" and "get" functions pointer on init and store in
> > |> struct mempool. Some thinking like
> > |>
> > |> file: drivers/net/ixgbe/ixgbe_ethdev.c
> > |> search for if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> > |
> > |I'll look at this one before posting the next version of the patch
> > |(soon). :)
> > 
> > Have you checked the above comment, if it difficult then postpone it. But
> > IMO it will reduce few cycles in fast-path and reduce the cache usage in
> > fast path
> > 
> > [snip]
> 
> I looked at it, but I'd need to do some more digging into it to see how it
> can be
> done properly. I'm not seeing any performance drop at the moment, and it may
> be a good way to improve further down the line. Is it OK to postpone?

I am OK for fixing it later. Performance issue should come in the use
cases where mempool "local cache" gets overflows and "get" and "put."
function pointers being used. Like crypto and ethdev, fast path function
pointers can accommodate in the main structure itself rather than one
more indirection.

Jerin

[dpdk-dev] [PATCH v7 5/5] mbuf: allow apps to change default mempool ops

2016-06-03 Thread Olivier Matz



On 06/03/2016 04:06 PM, Hunt, David wrote:
> 
> 
> On 6/3/2016 1:28 PM, Olivier MATZ wrote:
>>> [PATCH v7 5/5] mbuf: allow apps to change default mempool ops
>> Should the title be fixed?
>> I don't feel this allows application to change the default ops.
> 
> Allow _user_ to change default mempool ops, I think.

make default mempool ops configurable at build
 ?

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Wiles, Keith



Regards,
Keith



On 6/3/16, 11:04 AM, "dev on behalf of Wiles, Keith"  wrote:

>Sorry, I deleted all of the text as it was getting a bit long.
>
>Here are my thoughts as of now, which is a combination of many suggestions I 
>read from everyone?s emails. I hope this is not too hard to understand.
>
>- Break out the current command line options out of the DPDK common code and 
>move into a new lib.
>  - At this point I was thinking of keeping the rte_eal_init(args, argv) API 
> and just have it pass the args/argv to the new lib to create the data storage.
> - Maybe move the rte_eal_init() API to the new lib or keep it in the 
> common eal code. Do not want to go hog wild.
>  - The rte_eal_init(args, argv) would then call to the new API 
> rte_eal_initialize(void), which in turn queries the data storage. (still 
> thinking here)
>  - The example apps args needs to be passed to the examples as is for now, 
> then we can convert them one at a time if needed.
>
>- I would like to keep the storage of the data separate from the file parser 
>as they can use the ?set? routines to build the data storage up.
>  - Keeping them split allows for new parsers to be created, while keeping the 
> data storage from changing.
>- The rte_cfg code could be modified to use the new configuration if someone 
>wants to take on that task ?
>
>- Next is the data storage and how we can access the data in a clean simple 
>way.
>- I want to have some simple level of hierarchy in the data.
>  - Having a string containing at least two levels ?primary:secondary?.
> - Primary string is something like ?EAL? or ?Pktgen? or ?testpmd? to 
> divide the data storage into logical major groups.
>- The primary allows us to have groups and then we can have common 
> secondary strings in different groups if needed.
> - Secondary string can be whatever the developer of that group would like 
> e.g. simple ?EAL:foobar?, two levels ?testpmd:foo.bar?
>
>  - The secondary string is treated as a single string if it has a hierarchy 
> or not, but referencing a single value in the data storage.
> - Key value pairs (KVP) or a hashmap data store.
>- The key here is the whole string ?EAL:foobar? not just ?foobar? 
> secondary string.
>   - If we want to have the two split I am ok with that as well 
> meaning the API would be:
> rte_map_get(mapObj, ?EAL?, ?foo.bar?);
> rte_map_set(mapObj, ?EAL?, ?foo.bar?, value);
>   - Have the primary as a different section in the data store, would 
> allow for dumping that section maybe easier, not sure.
>  - I am leaning toward
A single string, but let me know your thoughts.
> - Not going to try splitting up the string or parse it as it is up to the 
> developer to make it unique in the data store.
>- Use a code design to make the strings simple to use without having typos be 
>a problem.
>   - Not sure what the design is yet, but I do not want to have to concat two 
> string or split strings in the code.
>
>This is as far as I have gotten and got tired of typing ?
>
>I hope this will satisfy most everyone?s needs for now.
>
>
>Regards,
>Keith
>
>
>
>

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Yerden Zhumabekov

+1

We're using INI in our app, turned out to be quite simple, like this:

[eal]
;; EAL common options:
;;   -c COREMASK Hexadecimal bitmask of cores to run on
# coremask = fff

;;   -l CORELIST List of cores to run on
corelist = 3,4,5

;;   --lcores COREMAPMap lcore set to physical cpu set
; coremap =

;;   --master-lcore ID   Core ID that is used as master
; master-lcore-id = 0

;;   -n CHANNELS Number of memory channels
memory-channels = 4


On 01.06.2016 22:18, Bruce Richardson wrote:
> On Wed, Jun 01, 2016 at 10:58:41AM -0500, Jay Rolette wrote:
>> On Wed, Jun 1, 2016 at 10:00 AM, Wiles, Keith  
>> wrote:
>>
>>> Started from the link below, but did not want to highjack the thread.
>>> http://dpdk.org/ml/archives/dev/2016-June/040021.html
>>>
>>> I was thinking about this problem from a user perspective and command line
>>> options are very difficult to manage specifically when you have a large
>>> number of options as we have in dpdk. I see all of these options as a type
>>> of database of information for the DPDK and the application, because the
>>> application command line options are also getting very complex as well.
>>>
>>> I have been looking at a number of different options here and the
>>> direction I was thinking was using a file for the options and
>>> configurations with the data in a clean format. It could have been a INI
>>> file or JSON or XML, but they all seem to have some problems I do not like.
>>> The INI file is too flat and I wanted a hierarchy in the data, the JSON
>>> data is similar and XML is just hard to read. I wanted to be able to manage
>>> multiple applications and possible system the DPDK/app runs. The problem
>>> with the above formats is they are just data and not easy to make decisions
>>> about the system and applications at runtime.
>>>
>> INI format is simplest for users to read, but if you really need hierarchy,
>> JSON will do that just fine. Not sure what you mean by "JSON data is
>> similar"...
>>
>>
> I'd be quite concerned if we start needing lots of hierarchies for 
> configuration.
>
> I'd really just like to see ini file format used for this because:
> * it's a well understood, simple format
> * very easily human readable and editable
> * lots of support for it in lots of languages
> * hierarchies are possible in it too - just not as easy as in other formats
>though. [In a previous life I worked with ini files which had address
>hierarchies 6-levels deep in them. It wasn't hard to work with]
> * it works well with grep since you must have one value per-line
> * it allows comments
> * we already have a DPDK library for parsing them
>
> However, for me the biggest advantage of using something like ini is that it
> would force us to keep things simple!
>
> I'd stay away from formats like json or XML that are designed for serializing
> entire objects or structures, and look for something that allows us to just
> specify configuration values.
>
> Regards,
> /Bruce
>

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Wiles, Keith

Sorry, I deleted all of the text as it was getting a bit long.

Here are my thoughts as of now, which is a combination of many suggestions I 
read from everyone?s emails. I hope this is not too hard to understand.

- Break out the current command line options out of the DPDK common code and 
move into a new lib.
  - At this point I was thinking of keeping the rte_eal_init(args, argv) API 
and just have it pass the args/argv to the new lib to create the data storage.
 - Maybe move the rte_eal_init() API to the new lib or keep it in the 
common eal code. Do not want to go hog wild.
  - The rte_eal_init(args, argv) would then call to the new API 
rte_eal_initialize(void), which in turn queries the data storage. (still 
thinking here)
  - The example apps args needs to be passed to the examples as is for now, 
then we can convert them one at a time if needed.

- I would like to keep the storage of the data separate from the file parser as 
they can use the ?set? routines to build the data storage up.
  - Keeping them split allows for new parsers to be created, while keeping the 
data storage from changing.
- The rte_cfg code could be modified to use the new configuration if someone 
wants to take on that task ?

- Next is the data storage and how we can access the data in a clean simple way.
- I want to have some simple level of hierarchy in the data.
  - Having a string containing at least two levels ?primary:secondary?.
 - Primary string is something like ?EAL? or ?Pktgen? or ?testpmd? to 
divide the data storage into logical major groups.
- The primary allows us to have groups and then we can have common 
secondary strings in different groups if needed.
 - Secondary string can be whatever the developer of that group would like 
e.g. simple ?EAL:foobar?, two levels ?testpmd:foo.bar?

  - The secondary string is treated as a single string if it has a hierarchy or 
not, but referencing a single value in the data storage.
 - Key value pairs (KVP) or a hashmap data store.
- The key here is the whole string ?EAL:foobar? not just ?foobar? 
secondary string.
   - If we want to have the two split I am ok with that as well meaning 
the API would be:
 rte_map_get(mapObj, ?EAL?, ?foo.bar?);
 rte_map_set(mapObj, ?EAL?, ?foo.bar?, value);
   - Have the primary as a different section in the data store, would 
allow for dumping that section maybe easier, not sure.
  - I am leaning toward
 - Not going to try splitting up the string or parse it as it is up to the 
developer to make it unique in the data store.
- Use a code design to make the strings simple to use without having typos be a 
problem.
   - Not sure what the design is yet, but I do not want to have to concat two 
string or split strings in the code.

This is as far as I have gotten and got tired of typing ?

I hope this will satisfy most everyone?s needs for now.


Regards,
Keith

[dpdk-dev] [PATCH 1/2] qede: rename config option

2016-06-03 Thread Bruce Richardson

On Mon, May 30, 2016 at 03:38:41PM +0100, Bruce Richardson wrote:
> On Fri, May 06, 2016 at 09:21:30PM -0700, Rasesh Mody wrote:
> > Rename RTE_LIBRTE_QEDE_DEBUG_DRV to RTE_LIBRTE_QEDE_DEBUG_DRIVER
> > 
> > Fixes: 425cba2a5176 ("qede: enable PMD build")
> > Fixes: 33e9ff1b72ca ("qede: add core driver")
> > 
> These fixes lines don't have the correct commit id's in them. The commit
> "qede: enable PMD build" is actually commit 3eae93a9, and "qede: add core
> driver" is 2ea6f76a. The commit id of a patch can change from what it is
> originally when the patch is applied.
> 
> /Bruce
> 
Patch series applied to dpdk-next-net/rel_16_07 with corrected commit ids in
the fixes line, and some reworking of titles and commit messages as discussed.

Regards,
/Bruce

[dpdk-dev] [PATCH v8 1/3] mempool: support external mempool operations

2016-06-03 Thread David Hunt

Until now, the objects stored in a mempool were internally stored in a
ring. This patch introduces the possibility to register external handlers
replacing the ring.

The default behavior remains unchanged, but calling the new function
rte_mempool_set_handler() right after rte_mempool_create_empty() allows
the user to change the handler that will be used when populating
the mempool.

This patch also adds a set of default ops (function callbacks) based
on rte_ring.

Signed-off-by: Olivier Matz 
Signed-off-by: David Hunt 
---
 app/test/test_mempool_perf.c |   1 -
 lib/librte_mempool/Makefile  |   2 +
 lib/librte_mempool/rte_mempool.c |  73 -
 lib/librte_mempool/rte_mempool.h | 247 ---
 lib/librte_mempool/rte_mempool_default.c | 157 
 lib/librte_mempool/rte_mempool_ops.c | 149 +++
 6 files changed, 562 insertions(+), 67 deletions(-)
 create mode 100644 lib/librte_mempool/rte_mempool_default.c
 create mode 100644 lib/librte_mempool/rte_mempool_ops.c

diff --git a/app/test/test_mempool_perf.c b/app/test/test_mempool_perf.c
index cdc02a0..091c1df 100644
--- a/app/test/test_mempool_perf.c
+++ b/app/test/test_mempool_perf.c
@@ -161,7 +161,6 @@ per_lcore_mempool_test(__attribute__((unused)) void *arg)
   n_get_bulk);
if (unlikely(ret < 0)) {
rte_mempool_dump(stdout, mp);
-   rte_ring_dump(stdout, mp->ring);
/* in this case, objects are lost... */
return -1;
}
diff --git a/lib/librte_mempool/Makefile b/lib/librte_mempool/Makefile
index 43423e0..8cac29b 100644
--- a/lib/librte_mempool/Makefile
+++ b/lib/librte_mempool/Makefile
@@ -42,6 +42,8 @@ LIBABIVER := 2

 # all source are stored in SRCS-y
 SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool.c
+SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_ops.c
+SRCS-$(CONFIG_RTE_LIBRTE_MEMPOOL) +=  rte_mempool_default.c
 # install includes
 SYMLINK-$(CONFIG_RTE_LIBRTE_MEMPOOL)-include := rte_mempool.h

diff --git a/lib/librte_mempool/rte_mempool.c b/lib/librte_mempool/rte_mempool.c
index b54de43..eb74e25 100644
--- a/lib/librte_mempool/rte_mempool.c
+++ b/lib/librte_mempool/rte_mempool.c
@@ -148,7 +148,7 @@ mempool_add_elem(struct rte_mempool *mp, void *obj, 
phys_addr_t physaddr)
 #endif

/* enqueue in ring */
-   rte_ring_sp_enqueue(mp->ring, obj);
+   rte_mempool_ops_enqueue_bulk(mp, , 1);
 }

 /* call obj_cb() for each mempool element */
@@ -303,40 +303,6 @@ rte_mempool_xmem_usage(__rte_unused void *vaddr, uint32_t 
elt_num,
return (size_t)paddr_idx << pg_shift;
 }

-/* create the internal ring */
-static int
-rte_mempool_ring_create(struct rte_mempool *mp)
-{
-   int rg_flags = 0, ret;
-   char rg_name[RTE_RING_NAMESIZE];
-   struct rte_ring *r;
-
-   ret = snprintf(rg_name, sizeof(rg_name),
-   RTE_MEMPOOL_MZ_FORMAT, mp->name);
-   if (ret < 0 || ret >= (int)sizeof(rg_name))
-   return -ENAMETOOLONG;
-
-   /* ring flags */
-   if (mp->flags & MEMPOOL_F_SP_PUT)
-   rg_flags |= RING_F_SP_ENQ;
-   if (mp->flags & MEMPOOL_F_SC_GET)
-   rg_flags |= RING_F_SC_DEQ;
-
-   /* Allocate the ring that will be used to store objects.
-* Ring functions will return appropriate errors if we are
-* running as a secondary process etc., so no checks made
-* in this function for that condition.
-*/
-   r = rte_ring_create(rg_name, rte_align32pow2(mp->size + 1),
-   mp->socket_id, rg_flags);
-   if (r == NULL)
-   return -rte_errno;
-
-   mp->ring = r;
-   mp->flags |= MEMPOOL_F_RING_CREATED;
-   return 0;
-}
-
 /* free a memchunk allocated with rte_memzone_reserve() */
 static void
 rte_mempool_memchunk_mz_free(__rte_unused struct rte_mempool_memhdr *memhdr,
@@ -354,7 +320,7 @@ rte_mempool_free_memchunks(struct rte_mempool *mp)
void *elt;

while (!STAILQ_EMPTY(>elt_list)) {
-   rte_ring_sc_dequeue(mp->ring, );
+   rte_mempool_ops_dequeue_bulk(mp, , 1);
(void)elt;
STAILQ_REMOVE_HEAD(>elt_list, next);
mp->populated_size--;
@@ -383,13 +349,16 @@ rte_mempool_populate_phys(struct rte_mempool *mp, char 
*vaddr,
unsigned i = 0;
size_t off;
struct rte_mempool_memhdr *memhdr;
-   int ret;

/* create the internal ring if not already done */
-   if ((mp->flags & MEMPOOL_F_RING_CREATED) == 0) {
-   ret = rte_mempool_ring_create(mp);
-   if (ret < 0)
-   return ret;
+   if ((mp->flags & MEMPOOL_F_POOL_CREATED) == 0) {
+   rte_errno = 0;
+

[dpdk-dev] [PATCH v8 0/5] mempool: add external mempool manager

2016-06-03 Thread David Hunt

Here's the latest version of the External Mempool Manager patchset.
It's re-based on top of the latest head as of 19/5/2016, including
Olivier's 35-part patch series on mempool re-org [1]

[1] http://dpdk.org/ml/archives/dev/2016-May/039229.html

v8 changes:

 * merged first three patches in the series into one.
 * changed parameters to ops callback to all be rte_mempool pointer
   rather than than pointer to opaque data or uint64.
 * comment fixes.
 * fixed parameter to _free function (was inconsistent).
 * changed MEMPOOL_F_RING_CREATED to MEMPOOL_F_POOL_CREATED

v7 changes:

 * Changed rte_mempool_handler_table to rte_mempool_ops_table
 * Changed hander_idx to ops_index in rte_mempool struct
 * Reworked comments in rte_mempool.h around ops functions
 * Changed rte_mempool_hander.c to rte_mempool_ops.c
 * Changed all functions containing _handler_ to _ops_
 * Now there is no mention of 'handler' left
 * Other small changes out of review of mailing list

v6 changes:

 * Moved the flags handling from rte_mempool_create_empty to
   rte_mempool_create, as it's only there for backward compatibility
 * Various comment additions and cleanup
 * Renamed rte_mempool_handler to rte_mempool_ops
 * Added a union for *pool and u64 pool_id in struct rte_mempool
 * split the original patch into a few parts for easier review.
 * rename functions with _ext_ to _ops_.
 * addressed review comments
 * renamed put and get functions to enqueue and dequeue
 * changed occurences of rte_mempool_ops to const, as they
   contain function pointers (security)
 * split out the default external mempool handler into a separate
   patch for easier review

v5 changes:
 * rebasing, as it is dependent on another patch series [1]

v4 changes (Olivier Matz):
 * remove the rte_mempool_create_ext() function. To change the handler, the
   user has to do the following:
   - mp = rte_mempool_create_empty()
   - rte_mempool_set_handler(mp, "my_handler")
   - rte_mempool_populate_default(mp)
   This avoids to add another function with more than 10 arguments, duplicating
   the doxygen comments
 * change the api of rte_mempool_alloc_t: only the mempool pointer is required
   as all information is available in it
 * change the api of rte_mempool_free_t: remove return value
 * move inline wrapper functions from the .c to the .h (else they won't be
   inlined). This implies to have one header file (rte_mempool.h), or it
   would have generate cross dependencies issues.
 * remove now unused MEMPOOL_F_INT_HANDLER (note: it was misused anyway due
   to the use of && instead of &)
 * fix build in debug mode (__MEMPOOL_STAT_ADD(mp, put_pool, n) remaining)
 * fix build with shared libraries (global handler has to be declared in
   the .map file)
 * rationalize #include order
 * remove unused function rte_mempool_get_handler_name()
 * rename some structures, fields, functions
 * remove the static in front of rte_tailq_elem rte_mempool_tailq (comment
   from Yuanhan)
 * test the ext mempool handler in the same file than standard mempool tests,
   avoiding to duplicate the code
 * rework the custom handler in mempool_test
 * rework a bit the patch selecting default mbuf pool handler
 * fix some doxygen comments

v3 changes:
 * simplified the file layout, renamed to rte_mempool_handler.[hc]
 * moved the default handlers into rte_mempool_default.c
 * moved the example handler out into app/test/test_ext_mempool.c
 * removed is_mc/is_mp change, slight perf degredation on sp cached operation
 * removed stack hanler, may re-introduce at a later date
 * Changes out of code reviews

v2 changes:
 * There was a lot of duplicate code between rte_mempool_xmem_create and
   rte_mempool_create_ext. This has now been refactored and is now
   hopefully cleaner.
 * The RTE_NEXT_ABI define is now used to allow building of the library
   in a format that is compatible with binaries built against previous
   versions of DPDK.
 * Changes out of code reviews. Hopefully I've got most of them included.

The External Mempool Manager is an extension to the mempool API that allows
users to add and use an external mempool manager, which allows external memory
subsystems such as external hardware memory management systems and software
based memory allocators to be used with DPDK.

The existing API to the internal DPDK mempool manager will remain unchanged
and will be backward compatible. However, there will be an ABI breakage, as
the mempool struct is changing. These changes are all contained withing
RTE_NEXT_ABI defs, and the current or next code can be changed with
the CONFIG_RTE_NEXT_ABI config setting

There are two aspects to external mempool manager.
  1. Adding the code for your new mempool operations (ops). This is
 achieved by adding a new mempool ops source file into the
 librte_mempool library, and using the REGISTER_MEMPOOL_HANDLER macro.
  2. Using the new API to call rte_mempool_create_empty and
 rte_mempool_set_ops to create a new mempool
 using the name

[dpdk-dev] [PATCH 1/2] qede: rename config option

2016-06-03 Thread Rasesh Mody

Thanks!

> From: Bruce Richardson [mailto:bruce.richardson at intel.com]
> Sent: Friday, June 03, 2016 8:02 AM
> 
> On Mon, May 30, 2016 at 03:38:41PM +0100, Bruce Richardson wrote:
> > On Fri, May 06, 2016 at 09:21:30PM -0700, Rasesh Mody wrote:
> > > Rename RTE_LIBRTE_QEDE_DEBUG_DRV to
> RTE_LIBRTE_QEDE_DEBUG_DRIVER
> > >
> > > Fixes: 425cba2a5176 ("qede: enable PMD build")
> > > Fixes: 33e9ff1b72ca ("qede: add core driver")
> > >
> > These fixes lines don't have the correct commit id's in them. The
> > commit
> > "qede: enable PMD build" is actually commit 3eae93a9, and "qede: add
> > core driver" is 2ea6f76a. The commit id of a patch can change from
> > what it is originally when the patch is applied.
> >
> > /Bruce
> >
> Patch series applied to dpdk-next-net/rel_16_07 with corrected commit ids
> in the fixes line, and some reworking of titles and commit messages as
> discussed.
> 
> Regards,
> /Bruce

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Panu Matilainen

On 06/03/2016 03:01 PM, Arnon Warshavsky wrote:
>
>
> On Fri, Jun 3, 2016 at 2:50 PM, Neil Horman  > wrote:
>
> On Fri, Jun 03, 2016 at 12:01:30PM +0100, Bruce Richardson wrote:
> > On Fri, Jun 03, 2016 at 11:29:43AM +0100, Bruce Richardson wrote:
> > > On Thu, Jun 02, 2016 at 04:08:37PM -0400, Neil Horman wrote:
> > > > On Thu, Jun 02, 2016 at 07:41:10PM +, Wiles, Keith wrote:
> > > > >
> > > > > On 6/2/16, 12:11 PM, "Neil Horman"  > wrote:
> > > > >
> > > > > >
> > > > > >1) The definition of a config structure that can be passed
> to rte_eal_init,
> > > > > >defining the configuration for that running process
> > > > >
> > > > > Having a configuration structure means we have to have an
> ABI change to that structure anytime we add or remove an option. I
> was thinking a very simple DB of some kind would be better. Have the
> code query the DB to obtain the needed information. The APIs used to
> query and set the DB needs to be very easy to use as well.
> > > >
> > > > Thats a fair point.  A decent starting point is likely a
> simple struct that
> > > > looks like this:
> > > >
> > > > struct key_vals {
> > > >   char *key;
> > > >   union {
> > > >   ulong longval;
> > > >   void *ptrval;
> > > >   } value;
> > > > };
> > > >
> > > > struct config {
> > > >   size_t count;
> > > >   struct key_vals kvp[0];
> > > > };
> > > >
> > > > >
> > > > > Maybe each option can define its own structure if needed or
> just a simple variable type can be used for the basic types (int,
> string, bool, ?)
> > > > >
> > > > Well, if you have config sections that require mulitiple
> elements, I'd handle
> > > > that with naming, i.e. if you have a config group that has an
> int and char
> > > > value, I'd name them "group.intval", and "group.charval", so
> they are
> > > > independently searchable, but linked from a nomenclature
> standpoint.
> > > >
> > > > > Would this work better in the long run, does a fixed
> structure still make sense?
> > > > >
> > > > No. I think you're ABI concerns are valid, but the above is
> likely a good
> > > > starting point to address them.
> > > >
> > > > Best
> > > > Neil
> > >
> > > I'll throw out one implementation idea here that I looked at
> previously, for
> > > the reason that it was simple enough implement with existing code.
> > >
> > > We already have the cfgfile library which works with name/value
> pairs read from
> > > ini files on disk. However, it would be easy enough to add
> couple of APIs to
> > > that to allow the user to "set" values inside an ini structure
> as well. With
> > > that done we can then just add a new eal_init api which takes a
> single
> > > "struct rte_cfgfile *" as parameter. For those apps that want to
> just use
> > > inifiles for configuration straight, they can then do:
> > >
> > > cfg = rte_cfgfile_load("my_cfg_file");
> > > rte_eal_newinit(cfg);
> > >
> > > Those who want a different config can instead do:
> > >
> > > cfg = rte_cfgfile_new();
> > > rte_cfgfile_add_section(cfg, "dpdk");
> > > foreach_eal_setting_wanted:
> > > rte_cfgfile_set(cfg, "dpdk", mysetting, myvalue);
> > > rte_eal_newinit(cfg);
> > >
> > From chatting to a couple of other DPDK dev's here I suspect I may
> not have
> > been entirely clear here with this example. What is being shown
> above is building
> > up a "config-file" in memory - or rather a config structure which
> happens to
> > have the idea of sections and values as an ini file has. There is
> no actual
> > file ever being written to disk, and for those using any non-ini
> config file
> > structure for their app, the code overhead of using the APIs above
> should be
> > pretty much the same as building up any other set of key-value
> pairs in
> > memory to pass to an init function.
> >
> > Hope this is a little clearer now.
> >
> I'm fine with the idea of reusing the config file library that
> currently exists,
> or more to the point, modifying it to be usable as a configuration
> API, rather
> than a configuration file parser.  My primary interest is in
> separating the user
> configuration mechanism from the internal library configuration lookup
> mechanism.  What I would really like to be able to see is
> application developers
> have the flexibiilty to choose their own configuration method and
> format, and
> programatically build a configuration for the dpdk on a per-instance
> basis prior
> to calling rte_eal_init
>
> It seems like this approach

[dpdk-dev] [PATCH v5 0/6] Virtio-net PMD: QEMU QTest extension for container

2016-06-03 Thread Thomas Monjalon

2016-06-03 12:17, Yuanhan Liu:
> On Thu, Jun 02, 2016 at 06:30:18PM +0900, Tetsuya Mukawa wrote:
> > Hi Yuanhan,
> > 
> > On 2016/06/02 16:31, Yuanhan Liu wrote:
> > > But still, I'd ask do we really need 2 virtio for container solutions?
> > 
> > I appreciate your comments.
> 
> No, I appreciate your effort for contributing to DPDK! vhost-pmd stuff
> is just brilliant!
> 
> > Let me have time to discuss it with our team.
> 
> I'm wondering could we have one solution only. IMO, the drawback of
> having two (quite different) solutions might outweighs the benefit
> it takes. Say, it might just confuse user.

+1

> OTOH, I'm wondering could you adapt to Jianfeng's solution? If not,
> what's the missing parts, and could we fix it? I'm thinking having
> one unified solution will keep ours energy/focus on one thing, making
> it better and better! Having two just splits the energy; it also
> introduces extra burden for maintaining.

+1

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Neil Horman

On Fri, Jun 03, 2016 at 07:07:50PM +, Wiles, Keith wrote:
> On 6/3/16, 2:00 PM, "dev on behalf of Wiles, Keith"  on behalf of keith.wiles at intel.com> wrote:
> 
> >On 6/3/16, 1:52 PM, "Arnon Warshavsky" mailto:arnon at 
> >qwilt.com>> wrote:
> >
> >
> >
> >On Fri, Jun 3, 2016 at 9:38 PM, Neil Horman  >tuxdriver.com> wrote:
> >On Fri, Jun 03, 2016 at 06:29:13PM +, Wiles, Keith wrote:
> >>
> >> On 6/3/16, 12:44 PM, "Neil Horman"  >> tuxdriver.com> wrote:
> >>
> >> >On Fri, Jun 03, 2016 at 04:04:14PM +, Wiles, Keith wrote:
> >> >> Sorry, I deleted all of the text as it was getting a bit long.
> >> >>
> >> >> Here are my thoughts as of now, which is a combination of many 
> >> >> suggestions I read from everyone?s emails. I hope this is not too hard 
> >> >> to understand.
> >> >>
> >> >> - Break out the current command line options out of the DPDK common 
> >> >> code and move into a new lib.
> >> >>   - At this point I was thinking of keeping the rte_eal_init(args, 
> >> >> argv) API and just have it pass the args/argv to the new lib to create 
> >> >> the data storage.
> >> >>  - Maybe move the rte_eal_init() API to the new lib or keep it in 
> >> >> the common eal code. Do not want to go hog wild.
> >> >>   - The rte_eal_init(args, argv) would then call to the new API 
> >> >> rte_eal_initialize(void), which in turn queries the data storage. 
> >> >> (still thinking here)
> >> >These three items seem to be the exact opposite of my suggestion.  The 
> >> >point of
> >> >this change was to segregate the parsing of configuration away from the
> >> >initalization dpdk using that configurtion.  By keeping rte_eal_init in 
> >> >such a
> >> >way that the command line is directly passed into it, you've not changed 
> >> >that
> >> >implicit binding to command line options.
> >>
> >> Neil,
> >>
> >> You maybe reading the above wrong or I wrote it wrong, which is a high 
> >> possibility. I want to move the command line parsing out of DPDK an into a 
> >> library, but I still believe I need to provide some backward compatibility 
> >> for ABI and to reduce the learning curve. The current applications can 
> >> still call the rte_eal_init(), which then calls the new lib parser for 
> >> dpdk command line options and then calls rte_eal_initialize() or move to 
> >> the new API rte_eal_initialize() preceded by a new library call to parse 
> >> the old command line args. At some point we can deprecate the 
> >> rte_eal_init() if we think it is reasonable.
> >>
> >> >
> >> >I can understand if you want to keep rte_eal_init as is for ABI purposes, 
> >> >but
> >> >then you should create an rte_eal_init2(foo), where foo is some handle to 
> >> >in
> >> >memory parsed configuration, so that applications can preform that 
> >> >separation.
> >>
> >> I think you describe what I had planned here. The rte_eal_initialize() 
> >> routine is the new rte_eal_init2() API and the rte_eal_init() was only for 
> >> backward compatibility was my thinking. I figured the argument to 
> >> rte_eal_initialize() would be something to be decided, but it will mostly 
> >> likely be some type of pointer to the storage.
> >>
> >> I hope that clears that up, but let me know.
> >>
> >yes, that clarifies your thinking, and I agree with it.  Thank you!
> >Neil
> >
> >> ++Keith
> >>
> >> >
> >> >Neil
> >> >
> >> >>   - The example apps args needs to be passed to the examples as is for 
> >> >> now, then we can convert them one at a time if needed.
> >> >>
> >> >> - I would like to keep the storage of the data separate from the file 
> >> >> parser as they can use the ?set? routines to build the data storage up.
> >> >>   - Keeping them split allows for new parsers to be created, while 
> >> >> keeping the data storage from changing.
> >> >> - The rte_cfg code could be modified to use the new configuration if 
> >> >> someone wants to take on that task ?
> >> >>
> >> >> - Next is the data storage and how we can access the data in a clean 
> >> >> simple way.
> >> >> - I want to have some simple level of hierarchy in the data.
> >> >>   - Having a string containing at least two levels ?primary:secondary?.
> >> >>  - Primary string is something like ?EAL? or ?Pktgen? or ?testpmd? 
> >> >> to divide the data storage into logical major groups.
> >> >> - The primary allows us to have groups and then we can have 
> >> >> common secondary strings in different groups if needed.
> >> >>  - Secondary string can be whatever the developer of that group 
> >> >> would like e.g. simple ?EAL:foobar?, two levels ?testpmd:foo.bar?
> >> >>
> >> >>   - The secondary string is treated as a single string if it has a 
> >> >> hierarchy or not, but referencing a single value in the data storage.
> >> >>  - Key value pairs (KVP) or a hashmap data store.
> >> >> - The key here is the whole string ?EAL:foobar? not just 
> >> >> ?foobar? secondary string.
> >> >>

[dpdk-dev] [PATCH v7 2/5] mempool: remove rte_ring from rte_mempool struct

2016-06-03 Thread Hunt, David



On 6/3/2016 1:28 PM, Olivier MATZ wrote:
>
> On 06/02/2016 03:27 PM, David Hunt wrote:
>> Now that we're moving to an external mempoool handler, which
>> uses a void *pool_data as a pointer to the pool data, remove the
>> unneeded ring pointer from the mempool struct.
>>
>> Signed-off-by: David Hunt 
>> ---
>>   app/test/test_mempool_perf.c | 1 -
>>   lib/librte_mempool/rte_mempool.h | 1 -
>>   2 files changed, 2 deletions(-)
>>
>> diff --git a/app/test/test_mempool_perf.c b/app/test/test_mempool_perf.c
>> index cdc02a0..091c1df 100644
>> --- a/app/test/test_mempool_perf.c
>> +++ b/app/test/test_mempool_perf.c
>> @@ -161,7 +161,6 @@ per_lcore_mempool_test(__attribute__((unused)) void *arg)
>> n_get_bulk);
>>  if (unlikely(ret < 0)) {
>>  rte_mempool_dump(stdout, mp);
>> -rte_ring_dump(stdout, mp->ring);
>>  /* in this case, objects are lost... */
>>  return -1;
>>  }
>> diff --git a/lib/librte_mempool/rte_mempool.h 
>> b/lib/librte_mempool/rte_mempool.h
>> index a6b28b0..c33eeb8 100644
>> --- a/lib/librte_mempool/rte_mempool.h
>> +++ b/lib/librte_mempool/rte_mempool.h
>> @@ -204,7 +204,6 @@ struct rte_mempool_memhdr {
>>*/
>>   struct rte_mempool {
>>  char name[RTE_MEMPOOL_NAMESIZE]; /**< Name of mempool. */
>> -struct rte_ring *ring;   /**< Ring to store objects. */
>>  union {
>>  void *pool_data; /**< Ring or pool to store objects */
>>  uint64_t pool_id;/**< External mempool identifier */
>>
> Sorry if I missed it in previous discussions, but I don't really
> see the point of having this in a separate commit, as the goal
> of the previous commit is to replace the ring by configurable ops.
>
> Moreover, after applying only the previous commit, the
> call to rte_ring_dump(stdout, mp->ring) would probably crash
> sine ring is NULL.
>
> I think this comment also applies to the next commit. Splitting
> between functionalities is good, but in this case I think the 3
> commits are linked together, and it should not break compilation
> or tests to facilitate the git bisect.
>
>
> Regards,
> Olivier

Yes. Originally there was a lot of discussion to split out the bigger 
patch, which I
did, and it was easier to review, but I think that now we're (very) 
close to final
revision, I can merge those three back into one.

Thanks,
Dave.

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Panu Matilainen

On 06/03/2016 02:50 PM, Neil Horman wrote:
> On Fri, Jun 03, 2016 at 12:01:30PM +0100, Bruce Richardson wrote:
>> On Fri, Jun 03, 2016 at 11:29:43AM +0100, Bruce Richardson wrote:
>>> On Thu, Jun 02, 2016 at 04:08:37PM -0400, Neil Horman wrote:
 On Thu, Jun 02, 2016 at 07:41:10PM +, Wiles, Keith wrote:
>
> On 6/2/16, 12:11 PM, "Neil Horman"  wrote:
>
>>
>> 1) The definition of a config structure that can be passed to 
>> rte_eal_init,
>> defining the configuration for that running process
>
> Having a configuration structure means we have to have an ABI change to 
> that structure anytime we add or remove an option. I was thinking a very 
> simple DB of some kind would be better. Have the code query the DB to 
> obtain the needed information. The APIs used to query and set the DB 
> needs to be very easy to use as well.

 Thats a fair point.  A decent starting point is likely a simple struct that
 looks like this:

 struct key_vals {
char *key;
union {
ulong longval;
void *ptrval;
} value;
 };

 struct config {
size_t count;
struct key_vals kvp[0];
 };

>
> Maybe each option can define its own structure if needed or just a simple 
> variable type can be used for the basic types (int, string, bool, ?)
>
 Well, if you have config sections that require mulitiple elements, I'd 
 handle
 that with naming, i.e. if you have a config group that has an int and char
 value, I'd name them "group.intval", and "group.charval", so they are
 independently searchable, but linked from a nomenclature standpoint.

> Would this work better in the long run, does a fixed structure still make 
> sense?
>
 No. I think you're ABI concerns are valid, but the above is likely a good
 starting point to address them.

 Best
 Neil
>>>
>>> I'll throw out one implementation idea here that I looked at previously, for
>>> the reason that it was simple enough implement with existing code.
>>>
>>> We already have the cfgfile library which works with name/value pairs read 
>>> from
>>> ini files on disk. However, it would be easy enough to add couple of APIs to
>>> that to allow the user to "set" values inside an ini structure as well. With
>>> that done we can then just add a new eal_init api which takes a single
>>> "struct rte_cfgfile *" as parameter. For those apps that want to just use
>>> inifiles for configuration straight, they can then do:
>>>
>>> cfg = rte_cfgfile_load("my_cfg_file");
>>> rte_eal_newinit(cfg);
>>>
>>> Those who want a different config can instead do:
>>>
>>> cfg = rte_cfgfile_new();
>>> rte_cfgfile_add_section(cfg, "dpdk");
>>> foreach_eal_setting_wanted:
>>> rte_cfgfile_set(cfg, "dpdk", mysetting, myvalue);
>>> rte_eal_newinit(cfg);
>>>
>> From chatting to a couple of other DPDK dev's here I suspect I may not have
>> been entirely clear here with this example. What is being shown above is 
>> building
>> up a "config-file" in memory - or rather a config structure which happens to
>> have the idea of sections and values as an ini file has. There is no actual
>> file ever being written to disk, and for those using any non-ini config file
>> structure for their app, the code overhead of using the APIs above should be
>> pretty much the same as building up any other set of key-value pairs in
>> memory to pass to an init function.

/me nods.

This is pretty much exactly what I suggested (only in much less detail) 
last year :) http://dpdk.org/ml/archives/dev/2015-October/024803.html

>> Hope this is a little clearer now.
> I'm fine with the idea of reusing the config file library that currently 
> exists,
> or more to the point, modifying it to be usable as a configuration API, rather
> than a configuration file parser.  My primary interest is in separating the 
> user
> configuration mechanism from the internal library configuration lookup
> mechanism.  What I would really like to be able to see is application 
> developers
> have the flexibiilty to choose their own configuration method and format, and
> programatically build a configuration for the dpdk on a per-instance basis 
> prior
> to calling rte_eal_init
>
> It seems like this approach satisfies that requirement

/me nods some more.

What the key-value config also can buy us is a direct mapping to cli 
options (which is something Keith has been looking into IIRC), at which 
point I think all the bases are quite nicely covered.

- Panu -

[dpdk-dev] [PATCH v7 5/5] mbuf: allow apps to change default mempool ops

2016-06-03 Thread Hunt, David



On 6/3/2016 3:10 PM, Olivier Matz wrote:
>
> On 06/03/2016 04:06 PM, Hunt, David wrote:
>>
>> On 6/3/2016 1:28 PM, Olivier MATZ wrote:
 [PATCH v7 5/5] mbuf: allow apps to change default mempool ops
>>> Should the title be fixed?
>>> I don't feel this allows application to change the default ops.
>> Allow _user_ to change default mempool ops, I think.
> make default mempool ops configurable at build
>   ?

Yup :)

[dpdk-dev] RFC: DPDK Long Term Support

2016-06-03 Thread Mcnamara, John

Introduction


This document sets out a proposal for a DPDK Long Term Support release (LTS).

The purpose of the DPDK LTS will be to maintain a stable release of DPDK with
backported bug fixes over an extended period of time. This will provide
downstream consumers of DPDK with a stable target on which to base
applications or packages.

As with previous DPDK guidelines this proposal is open for discussion within
the community. The consensus view will be included in the DPDK documentation
as a guideline.


LTS Maintainer
--

The proposed maintainer for the LTS is Yuanhan Liu
.


LTS Duration


The proposed duration of the LTS support is 2 years.

There will only be one LTS branch being maintained at any time. At the end of
the 2 year cycle the maintenance on the previous LTS will be wound down.


LTS Version


The proposed initial LTS version will be DPDK 16.07. The next versions, based
on a 2 year cycle, will be DPDK 18.08, 20.08, etc.


What changes should be backported
-

* Bug fixes that don't break the ABI.


What changes should not be backported
-

* API or ABI breaking changes.

* Features should not be backported. Unless:

   * There is a justifiable use case (for example a new PMD).
   * The change is non-invasive.
   * The work of preparing the backport is done by the proposer.
   * There is support within the community.


Role of the maintainer
--

* The maintainer will evaluate fixes to the DPDK master submitted by the
  fixing developer and apply them to the LTS branch/tree.

* The maintainer will evaluate backported patches from downstream consumers
  and apply them to the LTS branch/tree.

* The maintainer will not backport non-trivial fixes without assistance from
  the downstream consumers or requester.


Role of the downstream consumers


Developers submitting fixes to the mainline should also CC the maintainer so
that they can evaluate the patch. A  email address could be
provided for this so that it can be included as a CC in the commit messages
and documented in the Code Contribution Guidelines.

The downstream consumers (OSVs and DPDK dependent application and framework
developers) should identify issues in the field that have been fixed in the
mainline release and report them to the maintainer. They should, ideally,
assist with backporting any required fixes.


Testing
---

Intel will provide validation engineers to test the LTS branch/tree. Tested
releases can be marked using a Git tag with an incremented revision number. For
example: 16.07.00_LTS -> 16.07.01_LTS. The testing cadence should be quarterly
but will be best effort only and dependent on available resources.


Validated OSes
--

In order to reduce the testing effort the number of OSes which will be
officially validated should be as small as possible. The proposal is that the
following long term OSes are used for validation:

(OSV reps please confirm.)

* Ubuntu 16.04 LTS
* RHEL 7.3
* SuSE 11 SP4 or 12
* FreeBSD 10.3

Fixes for newer OSes, kernels (and associated KNI fixes), and newer GCC/Clang
versions can be backported but the validation effort will be limited to the
above platforms.


Release Notes
-

The LTS release notes should be updated to include a section with backported
fixes. Patches for backporting should include additions to the release notes
like patches to the mainline branch.


LTS Review
--

The LTS guidelines shall be reviewed after 1 year to adjust for any experiences
from LTS maintainership.

[dpdk-dev] [PATCH v7 5/5] mbuf: allow apps to change default mempool ops

2016-06-03 Thread Hunt, David



On 6/3/2016 1:28 PM, Olivier MATZ wrote:
>> [PATCH v7 5/5] mbuf: allow apps to change default mempool ops
> Should the title be fixed?
> I don't feel this allows application to change the default ops.

Allow _user_ to change default mempool ops, I think.

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Arnon Warshavsky

On Fri, Jun 3, 2016 at 2:50 PM, Neil Horman  wrote:

> On Fri, Jun 03, 2016 at 12:01:30PM +0100, Bruce Richardson wrote:
> > On Fri, Jun 03, 2016 at 11:29:43AM +0100, Bruce Richardson wrote:
> > > On Thu, Jun 02, 2016 at 04:08:37PM -0400, Neil Horman wrote:
> > > > On Thu, Jun 02, 2016 at 07:41:10PM +, Wiles, Keith wrote:
> > > > >
> > > > > On 6/2/16, 12:11 PM, "Neil Horman"  wrote:
> > > > >
> > > > > >
> > > > > >1) The definition of a config structure that can be passed to
> rte_eal_init,
> > > > > >defining the configuration for that running process
> > > > >
> > > > > Having a configuration structure means we have to have an ABI
> change to that structure anytime we add or remove an option. I was thinking
> a very simple DB of some kind would be better. Have the code query the DB
> to obtain the needed information. The APIs used to query and set the DB
> needs to be very easy to use as well.
> > > >
> > > > Thats a fair point.  A decent starting point is likely a simple
> struct that
> > > > looks like this:
> > > >
> > > > struct key_vals {
> > > >   char *key;
> > > >   union {
> > > >   ulong longval;
> > > >   void *ptrval;
> > > >   } value;
> > > > };
> > > >
> > > > struct config {
> > > >   size_t count;
> > > >   struct key_vals kvp[0];
> > > > };
> > > >
> > > > >
> > > > > Maybe each option can define its own structure if needed or just a
> simple variable type can be used for the basic types (int, string, bool, ?)
> > > > >
> > > > Well, if you have config sections that require mulitiple elements,
> I'd handle
> > > > that with naming, i.e. if you have a config group that has an int
> and char
> > > > value, I'd name them "group.intval", and "group.charval", so they are
> > > > independently searchable, but linked from a nomenclature standpoint.
> > > >
> > > > > Would this work better in the long run, does a fixed structure
> still make sense?
> > > > >
> > > > No. I think you're ABI concerns are valid, but the above is likely a
> good
> > > > starting point to address them.
> > > >
> > > > Best
> > > > Neil
> > >
> > > I'll throw out one implementation idea here that I looked at
> previously, for
> > > the reason that it was simple enough implement with existing code.
> > >
> > > We already have the cfgfile library which works with name/value pairs
> read from
> > > ini files on disk. However, it would be easy enough to add couple of
> APIs to
> > > that to allow the user to "set" values inside an ini structure as
> well. With
> > > that done we can then just add a new eal_init api which takes a single
> > > "struct rte_cfgfile *" as parameter. For those apps that want to just
> use
> > > inifiles for configuration straight, they can then do:
> > >
> > > cfg = rte_cfgfile_load("my_cfg_file");
> > > rte_eal_newinit(cfg);
> > >
> > > Those who want a different config can instead do:
> > >
> > > cfg = rte_cfgfile_new();
> > > rte_cfgfile_add_section(cfg, "dpdk");
> > > foreach_eal_setting_wanted:
> > > rte_cfgfile_set(cfg, "dpdk", mysetting, myvalue);
> > > rte_eal_newinit(cfg);
> > >
> > From chatting to a couple of other DPDK dev's here I suspect I may not
> have
> > been entirely clear here with this example. What is being shown above is
> building
> > up a "config-file" in memory - or rather a config structure which
> happens to
> > have the idea of sections and values as an ini file has. There is no
> actual
> > file ever being written to disk, and for those using any non-ini config
> file
> > structure for their app, the code overhead of using the APIs above
> should be
> > pretty much the same as building up any other set of key-value pairs in
> > memory to pass to an init function.
> >
> > Hope this is a little clearer now.
> >
> I'm fine with the idea of reusing the config file library that currently
> exists,
> or more to the point, modifying it to be usable as a configuration API,
> rather
> than a configuration file parser.  My primary interest is in separating
> the user
> configuration mechanism from the internal library configuration lookup
> mechanism.  What I would really like to be able to see is application
> developers
> have the flexibiilty to choose their own configuration method and format,
> and
> programatically build a configuration for the dpdk on a per-instance basis
> prior
> to calling rte_eal_init
>
> It seems like this approach satisfies that requirement
> Neil
>
>
If the there is no configuration structure , rather an opaque configuration
key/value store ,
the user applications can set and get knobs that are not seen by anyone
(library) that does not know them by name

e.g

int num_nodes = getCfgInt ( cfgObject , "eal" , "num_numa_nodes");
int delay = getCfgInt ( cfgObject , "drivers.ixgbe" , "some_delay");
setCfgInt (cfgObject , "my_app" , "num_days" , 7);
setCfgString (cfgObject , "my_app" , "best_day" , "Wednesday");

/Arnon

[dpdk-dev] about rx checksum flags

2016-06-03 Thread Olivier Matz

Hi,

On 06/02/2016 09:42 AM, Chandran, Sugesh wrote:
 Do you also suggest to drop IP checksum flags?
>>> > >
>>> > > IP checksum offload is mostly useless. If application needs to look at
>>> > > IP, it can do whole checksum in very few instructions, the whole
>>> > > header is in the same cache line as src/dst so the HW offload is really 
>>> > > no
>> > help.
>>> > >
> [Sugesh] The checksum offload can boost the tunneling performance in OVS.
> I guess the IP checksum also important as L4. In some cases, UDP checksum is
> zero and no need to validate it. But Ip checksum is present on all the 
> packets and that must be
> validated all  the time. At higher packet rate, the ip checksum offload can 
> offer slight 
> performance improvement. What do you think??
> 

Agree, in some situations (and this is even more true with packet
types / smartnics), the application could process without accessing
the packet data if we keep the IP cksum flags.

Regards,
Olivier

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Matthew Hall

On Fri, Jun 03, 2016 at 07:23:39PM +, Wiles, Keith wrote:
> If someone needs or wants default values in the API call then a wrapper 
> functions around the basic keystore APIs can be done by the developer or we 
> can add a new set of APIs to provide that type of feature, just like the 
> variable type APIs. Just as long as the basic APIs do not exclude we can add 
> it later.

Please make them part of the default, so this is easy to use and we don't get 
bifurcated and duplicated boilerplate code.

Matthew.

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Matthew Hall

On Fri, Jun 03, 2016 at 03:18:04PM -0400, Neil Horman wrote:
> I'm not opposed to default values, but it seems to me that if we are splitting
> out a configuration storage library from dpdk, part of the initzliation of 
> that
> library can be installing default values.  That is to say, instead of having 
> the
> code specific areas assume a default value if none is present in the config, 
> an
> init function for the configuration storage library would just populate the
> keystore.  That way all the dpdk itself has to do is a key lookup.
> 
> Neil

I don't think this provides as much mental locality of reference for people 
reading the code. But, an unwanted default argument can be filled with 0 / 
NULL / false as needed.

Matthew.

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Matthew Hall

On Fri, Jun 03, 2016 at 07:07:50PM +, Wiles, Keith wrote:
> If I understand your code above the API would pass in a default value if one 
> did not exist in the storage, which I guess is reasonable. Anyone think this 
> is a good idea or not?

This model has worked very well in my code at least. It keeps good reference 
locality between where the option is accessed and how it is configured and 
what it is called... all to the same line of code.

Matthew.

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Matthew Hall

On Fri, Jun 03, 2016 at 09:52:40PM +0300, Arnon Warshavsky wrote:
> What about the data types of the values?
> I would assume that as a library it can provide the service of typed
> get/set and not leave conversion and validation to the app.
> 
> rte_map_get_int(map,section,key)
> rte_map_get_double(...)
> rte_map_get_string(...)
> rte_map_get_bytes(...,destBuff , destBuffSize) //e.g byte array of RSS key
> 
> This may also allow some basic validity of the configuration file
> 
> Another point I forgot about is default values.
> We sometimes use a notation where the app also specifies a default value in
> case the configuration did not specify it
>   rte_map_get_int(map,section,key , defaultValue )
> and specify if this was a mandatory that has no default
>   rte_map_get_int_crash_if_missing (map,section,key)

This is why I was advising to use something similar to json-c's API with some 
type specific functions to make things easier for the users, with the 
dot-terminated naming made popular by sysctl such as 
"debug.intel.ctr_KMD_CTR_RPUPEI" and so forth.

> /Arnon

Matthew

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Neil Horman

On Fri, Jun 03, 2016 at 06:29:13PM +, Wiles, Keith wrote:
> 
> On 6/3/16, 12:44 PM, "Neil Horman"  wrote:
> 
> >On Fri, Jun 03, 2016 at 04:04:14PM +, Wiles, Keith wrote:
> >> Sorry, I deleted all of the text as it was getting a bit long.
> >> 
> >> Here are my thoughts as of now, which is a combination of many suggestions 
> >> I read from everyone?s emails. I hope this is not too hard to understand.
> >> 
> >> - Break out the current command line options out of the DPDK common code 
> >> and move into a new lib.
> >>   - At this point I was thinking of keeping the rte_eal_init(args, argv) 
> >> API and just have it pass the args/argv to the new lib to create the data 
> >> storage.
> >>  - Maybe move the rte_eal_init() API to the new lib or keep it in the 
> >> common eal code. Do not want to go hog wild.
> >>   - The rte_eal_init(args, argv) would then call to the new API 
> >> rte_eal_initialize(void), which in turn queries the data storage. (still 
> >> thinking here)
> >These three items seem to be the exact opposite of my suggestion.  The point 
> >of
> >this change was to segregate the parsing of configuration away from the
> >initalization dpdk using that configurtion.  By keeping rte_eal_init in such 
> >a
> >way that the command line is directly passed into it, you've not changed that
> >implicit binding to command line options.
> 
> Neil,
> 
> You maybe reading the above wrong or I wrote it wrong, which is a high 
> possibility. I want to move the command line parsing out of DPDK an into a 
> library, but I still believe I need to provide some backward compatibility 
> for ABI and to reduce the learning curve. The current applications can still 
> call the rte_eal_init(), which then calls the new lib parser for dpdk command 
> line options and then calls rte_eal_initialize() or move to the new API 
> rte_eal_initialize() preceded by a new library call to parse the old command 
> line args. At some point we can deprecate the rte_eal_init() if we think it 
> is reasonable.
> 
> >
> >I can understand if you want to keep rte_eal_init as is for ABI purposes, but
> >then you should create an rte_eal_init2(foo), where foo is some handle to in
> >memory parsed configuration, so that applications can preform that 
> >separation.
> 
> I think you describe what I had planned here. The rte_eal_initialize() 
> routine is the new rte_eal_init2() API and the rte_eal_init() was only for 
> backward compatibility was my thinking. I figured the argument to 
> rte_eal_initialize() would be something to be decided, but it will mostly 
> likely be some type of pointer to the storage.
> 
> I hope that clears that up, but let me know.
> 
yes, that clarifies your thinking, and I agree with it.  Thank you!
Neil

> ++Keith
> 
> >
> >Neil
> >
> >>   - The example apps args needs to be passed to the examples as is for 
> >> now, then we can convert them one at a time if needed.
> >> 
> >> - I would like to keep the storage of the data separate from the file 
> >> parser as they can use the ?set? routines to build the data storage up.
> >>   - Keeping them split allows for new parsers to be created, while keeping 
> >> the data storage from changing.
> >> - The rte_cfg code could be modified to use the new configuration if 
> >> someone wants to take on that task ?
> >> 
> >> - Next is the data storage and how we can access the data in a clean 
> >> simple way.
> >> - I want to have some simple level of hierarchy in the data.
> >>   - Having a string containing at least two levels ?primary:secondary?.
> >>  - Primary string is something like ?EAL? or ?Pktgen? or ?testpmd? to 
> >> divide the data storage into logical major groups.
> >> - The primary allows us to have groups and then we can have common 
> >> secondary strings in different groups if needed.
> >>  - Secondary string can be whatever the developer of that group would 
> >> like e.g. simple ?EAL:foobar?, two levels ?testpmd:foo.bar?
> >> 
> >>   - The secondary string is treated as a single string if it has a 
> >> hierarchy or not, but referencing a single value in the data storage.
> >>  - Key value pairs (KVP) or a hashmap data store.
> >> - The key here is the whole string ?EAL:foobar? not just ?foobar? 
> >> secondary string.
> >>- If we want to have the two split I am ok with that as well 
> >> meaning the API would be:
> >>  rte_map_get(mapObj, ?EAL?, ?foo.bar?);
> >>  rte_map_set(mapObj, ?EAL?, ?foo.bar?, value);
> >>- Have the primary as a different section in the data store, 
> >> would allow for dumping that section maybe easier, not sure.
> >>   - I am leaning toward
> >>  - Not going to try splitting up the string or parse it as it is up to 
> >> the developer to make it unique in the data store.
> >> - Use a code design to make the strings simple to use without having typos 
> >> be a problem.
> >>- Not sure what the design is yet, but I do

[dpdk-dev] [PATCH v7 5/5] mbuf: allow apps to change default mempool ops

2016-06-03 Thread Olivier MATZ

> [PATCH v7 5/5] mbuf: allow apps to change default mempool ops

Should the title be fixed?
I don't feel this allows application to change the default ops.

[dpdk-dev] [PATCH v7 2/5] mempool: remove rte_ring from rte_mempool struct

2016-06-03 Thread Olivier MATZ



On 06/02/2016 03:27 PM, David Hunt wrote:
> Now that we're moving to an external mempoool handler, which
> uses a void *pool_data as a pointer to the pool data, remove the
> unneeded ring pointer from the mempool struct.
> 
> Signed-off-by: David Hunt 
> ---
>  app/test/test_mempool_perf.c | 1 -
>  lib/librte_mempool/rte_mempool.h | 1 -
>  2 files changed, 2 deletions(-)
> 
> diff --git a/app/test/test_mempool_perf.c b/app/test/test_mempool_perf.c
> index cdc02a0..091c1df 100644
> --- a/app/test/test_mempool_perf.c
> +++ b/app/test/test_mempool_perf.c
> @@ -161,7 +161,6 @@ per_lcore_mempool_test(__attribute__((unused)) void *arg)
>  n_get_bulk);
>   if (unlikely(ret < 0)) {
>   rte_mempool_dump(stdout, mp);
> - rte_ring_dump(stdout, mp->ring);
>   /* in this case, objects are lost... */
>   return -1;
>   }
> diff --git a/lib/librte_mempool/rte_mempool.h 
> b/lib/librte_mempool/rte_mempool.h
> index a6b28b0..c33eeb8 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -204,7 +204,6 @@ struct rte_mempool_memhdr {
>   */
>  struct rte_mempool {
>   char name[RTE_MEMPOOL_NAMESIZE]; /**< Name of mempool. */
> - struct rte_ring *ring;   /**< Ring to store objects. */
>   union {
>   void *pool_data; /**< Ring or pool to store objects */
>   uint64_t pool_id;/**< External mempool identifier */
> 

Sorry if I missed it in previous discussions, but I don't really
see the point of having this in a separate commit, as the goal
of the previous commit is to replace the ring by configurable ops.

Moreover, after applying only the previous commit, the
call to rte_ring_dump(stdout, mp->ring) would probably crash
sine ring is NULL.

I think this comment also applies to the next commit. Splitting
between functionalities is good, but in this case I think the 3
commits are linked together, and it should not break compilation
or tests to facilitate the git bisect.


Regards,
Olivier

[dpdk-dev] [PATCH v7 1/5] mempool: support external mempool operations

2016-06-03 Thread Olivier MATZ

Hi,

Some comments below.

On 06/02/2016 03:27 PM, David Hunt wrote:
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> +/**
> + * prototype for implementation specific data provisioning function
> + * The function should provide the implementation specific memory for
> + * for use by the other mempool ops functions in a given mempool ops struct.
> + * E.g. the default ops provides an instance of the rte_ring for this 
> purpose.
> + * it will mostlikely point to a different type of data structure, and
> + * will be transparent to the application programmer.
> + * The function should also not touch the given *mp instance.
> + */
> +typedef void *(*rte_mempool_alloc_t)(const struct rte_mempool *mp);

nit: about doxygen comments, it's better to have a one-line description,
then a blank line, then the full description. While there, please also
check the uppercase at the beginning and the dot when relevant.


> +/**
> + * Structure storing the table of registered ops structs, each of which 
> contain
> + * the function pointers for the mempool ops functions.
> + * Each process has it's own storage for this ops struct aray so that

it's -> its
aray -> array


> + * the mempools can be shared across primary and secondary processes.
> + * The indices used to access the array are valid across processes, whereas
> + * any function pointers stored directly in the mempool struct would not be.
> + * This results in us simply having "ops_index" in the mempool struct.
> + */
> +struct rte_mempool_ops_table {
> + rte_spinlock_t sl; /**< Spinlock for add/delete. */
> + uint32_t num_ops;  /**< Number of used ops structs in the table. */
> + /**
> +  * Storage for all possible ops structs.
> +  */
> + struct rte_mempool_ops ops[RTE_MEMPOOL_MAX_OPS_IDX];
> +} __rte_cache_aligned;
> +
> +/** Array of registered ops structs */
> +extern struct rte_mempool_ops_table rte_mempool_ops_table;
> +
> +/**
> + * @internal Get the mempool ops struct from its index.
> + *
> + * @param ops_index
> + *   The index of the ops struct in the ops struct table. It must be a valid
> + *   index: (0 <= idx < num_ops).
> + * @return
> + *   The pointer to the ops struct in the table.
> + */
> +static inline struct rte_mempool_ops *
> +rte_mempool_ops_get(int ops_index)
> +{
> + return _mempool_ops_table.ops[ops_index];
> +}
> +
> +/**
> + * @internal wrapper for external mempool manager alloc callback.

wrapper for mempool_ops alloc callback
?
(same for other functions)


> @@ -922,7 +1124,7 @@ rte_mempool_put(struct rte_mempool *mp, void *obj)
>   */
>  static inline int __attribute__((always_inline))
>  __mempool_get_bulk(struct rte_mempool *mp, void **obj_table,
> -unsigned n, int is_mc)
> +unsigned int n, int is_mc)
>  {
>   int ret;
>   struct rte_mempool_cache *cache;

Despite "unsigned" is not conform to current checkpatch policy, I
don't think it should be modified in this patch.


> --- /dev/null
> +++ b/lib/librte_mempool/rte_mempool_ops.c

> +
> +#include 
> +
> +/* indirect jump table to support external memory pools */
> +struct rte_mempool_ops_table rte_mempool_ops_table = {
> + .sl =  RTE_SPINLOCK_INITIALIZER ,
> + .num_ops = 0
> +};
> +
> +/* add a new ops struct in rte_mempool_ops_table, return its index */
> +int
> +rte_mempool_ops_register(const struct rte_mempool_ops *h)

nit: "h" should be "ops" :)


> +{
> + struct rte_mempool_ops *ops;
> + int16_t ops_index;
> +
> + rte_spinlock_lock(_mempool_ops_table.sl);
> +
> + if (rte_mempool_ops_table.num_ops >=
> + RTE_MEMPOOL_MAX_OPS_IDX) {
> + rte_spinlock_unlock(_mempool_ops_table.sl);
> + RTE_LOG(ERR, MEMPOOL,
> + "Maximum number of mempool ops structs exceeded\n");
> + return -ENOSPC;
> + }
> +
> + if (h->put == NULL || h->get == NULL || h->get_count == NULL) {
> + rte_spinlock_unlock(_mempool_ops_table.sl);
> + RTE_LOG(ERR, MEMPOOL,
> + "Missing callback while registering mempool ops\n");
> + return -EINVAL;
> + }
> +
> + ops_index = rte_mempool_ops_table.num_ops++;
> + ops = _mempool_ops_table.ops[ops_index];
> + snprintf(ops->name, sizeof(ops->name), "%s", h->name);

I think we should check for truncation here, as it was done in:
85cf00791cca ("mem: avoid memzone/mempool/ring name truncation")
(this should be done before the num_ops++)


Regards,
Olivier

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Neil Horman

On Fri, Jun 03, 2016 at 04:04:14PM +, Wiles, Keith wrote:
> Sorry, I deleted all of the text as it was getting a bit long.
> 
> Here are my thoughts as of now, which is a combination of many suggestions I 
> read from everyone?s emails. I hope this is not too hard to understand.
> 
> - Break out the current command line options out of the DPDK common code and 
> move into a new lib.
>   - At this point I was thinking of keeping the rte_eal_init(args, argv) API 
> and just have it pass the args/argv to the new lib to create the data storage.
>  - Maybe move the rte_eal_init() API to the new lib or keep it in the 
> common eal code. Do not want to go hog wild.
>   - The rte_eal_init(args, argv) would then call to the new API 
> rte_eal_initialize(void), which in turn queries the data storage. (still 
> thinking here)
These three items seem to be the exact opposite of my suggestion.  The point of
this change was to segregate the parsing of configuration away from the
initalization dpdk using that configurtion.  By keeping rte_eal_init in such a
way that the command line is directly passed into it, you've not changed that
implicit binding to command line options.

I can understand if you want to keep rte_eal_init as is for ABI purposes, but
then you should create an rte_eal_init2(foo), where foo is some handle to in
memory parsed configuration, so that applications can preform that separation.

Neil

>   - The example apps args needs to be passed to the examples as is for now, 
> then we can convert them one at a time if needed.
> 
> - I would like to keep the storage of the data separate from the file parser 
> as they can use the ?set? routines to build the data storage up.
>   - Keeping them split allows for new parsers to be created, while keeping 
> the data storage from changing.
> - The rte_cfg code could be modified to use the new configuration if someone 
> wants to take on that task ?
> 
> - Next is the data storage and how we can access the data in a clean simple 
> way.
> - I want to have some simple level of hierarchy in the data.
>   - Having a string containing at least two levels ?primary:secondary?.
>  - Primary string is something like ?EAL? or ?Pktgen? or ?testpmd? to 
> divide the data storage into logical major groups.
> - The primary allows us to have groups and then we can have common 
> secondary strings in different groups if needed.
>  - Secondary string can be whatever the developer of that group would 
> like e.g. simple ?EAL:foobar?, two levels ?testpmd:foo.bar?
> 
>   - The secondary string is treated as a single string if it has a hierarchy 
> or not, but referencing a single value in the data storage.
>  - Key value pairs (KVP) or a hashmap data store.
> - The key here is the whole string ?EAL:foobar? not just ?foobar? 
> secondary string.
>- If we want to have the two split I am ok with that as well 
> meaning the API would be:
>  rte_map_get(mapObj, ?EAL?, ?foo.bar?);
>  rte_map_set(mapObj, ?EAL?, ?foo.bar?, value);
>- Have the primary as a different section in the data store, would 
> allow for dumping that section maybe easier, not sure.
>   - I am leaning toward
>  - Not going to try splitting up the string or parse it as it is up to 
> the developer to make it unique in the data store.
> - Use a code design to make the strings simple to use without having typos be 
> a problem.
>- Not sure what the design is yet, but I do not want to have to concat two 
> string or split strings in the code.
> 
> This is as far as I have gotten and got tired of typing ?
> 
> I hope this will satisfy most everyone?s needs for now.
> 
> 
> Regards,
> Keith
> 
> 
>

[dpdk-dev] [PATCH v7 1/5] mempool: support external mempool operations

2016-06-03 Thread Jan Viktorin

Hello,

sorry for top-posting. I vote for passing rte_mempool instead of void *. This 
is what I've already pointed at, a kind of type-safety...? Passing uint64_t 
just hides the problem. Another way is to name the union and pass it as eg. 
union rte_mempool_priv * to the callbacks.

Jan?Viktorin
RehiveTech
Sent?from?a?mobile?device
? P?vodn? zpr?va ?
Od: Olivier MATZ
Odesl?no: p?tek, 3. ?ervna 2016 13:08
Komu: Hunt, David; Jerin Jacob
Kopie: dev at dpdk.org; viktorin at rehivetech.com
P?edm?t: Re: [PATCH v7 1/5] mempool: support external mempool operations

On 06/03/2016 12:28 PM, Hunt, David wrote:
> On 6/3/2016 7:38 AM, Jerin Jacob wrote:
>> On Thu, Jun 02, 2016 at 02:27:19PM +0100, David Hunt wrote:
>>> +/**
>>> + * @internal wrapper for external mempool manager put callback.
>>> + *
>>> + * @param mp
>>> + * Pointer to the memory pool.
>>> + * @param obj_table
>>> + * Pointer to a table of void * pointers (objects).
>>> + * @param n
>>> + * Number of objects to put.
>>> + * @return
>>> + * - 0: Success; n objects supplied.
>>> + * - <0: Error; code of put function.
>>> + */
>>> +static inline int
>>> +rte_mempool_ops_enqueue_bulk(struct rte_mempool *mp, void * const
>>> *obj_table,
>>> + unsigned n)
>>> +{
>>> + struct rte_mempool_ops *ops;
>>> +
>>> + ops = rte_mempool_ops_get(mp->ops_index);
>>> + return ops->put(mp->pool_data, obj_table, n);
>> Pass by value of "pool_data", On 32 bit systems, casting back to
>> pool_id will
>> be an issue as void* on 32 bit is 4B. IMO, May be can use uint64_t to
>> pass by value and typecast to void* to fix it.
> 
> OK. I see the problem. I'll see 4 callbacks that need to change, free,
> get, put and get_count.
> So the callbacks will be:
> typedef void *(*rte_mempool_alloc_t)(struct rte_mempool *mp);
> typedef void (*rte_mempool_free_t)(uint64_t p);
> typedef int (*rte_mempool_put_t)(uint64_t p, void * const *obj_table,
> unsigned int n);
> typedef int (*rte_mempool_get_t)(uint64_t p, void **obj_table, unsigned
> int n);
> typedef unsigned (*rte_mempool_get_count)(uint64_t p);

I don't quite like the uint64_t argument (I exepect that most handlers
will use a pointer, and they will have to do a cast). What about giving
a (struct rte_mempool *) instead? The handler function would then
select between void * or uint64_t without a cast.
In that case, maybe the prototype of alloc should be:

typedef int (*rte_mempool_alloc_t)(struct rte_mempool *mp);

It would directly set mp->pool_data and return 0 or -errno.

By the way, I found a strange thing:

> typedef void (*rte_mempool_free_t)(void *p);

[...]

> void
> rte_mempool_ops_free(struct rte_mempool *mp)
> {
> struct rte_mempool_ops *ops;
> 
> ops = rte_mempool_ops_get(mp->ops_index);
> if (ops->free == NULL)
> return;
> return ops->free(mp);
> }
> 

Seems that the free cb expects mp->pool_data, but mp is passed.

Another alternative to the "uint64_t or ptr" question would be to use
a uintptr_t instead of a uint64_t. This is won't be possible if it need
to be a 64 bits value even on 32 bits architectures. We could then keep
the argument as pointer, and cast it to uintptr_t if needed.

Regards,
Olivier

[dpdk-dev] [PATCH v7 1/5] mempool: support external mempool operations

2016-06-03 Thread Hunt, David



On 6/3/2016 12:07 PM, Olivier MATZ wrote:
>
> On 06/03/2016 12:28 PM, Hunt, David wrote:
>> On 6/3/2016 7:38 AM, Jerin Jacob wrote:
>>> On Thu, Jun 02, 2016 at 02:27:19PM +0100, David Hunt wrote:
 +/**
 + * @internal wrapper for external mempool manager put callback.
 + *
 + * @param mp
 + *   Pointer to the memory pool.
 + * @param obj_table
 + *   Pointer to a table of void * pointers (objects).
 + * @param n
 + *   Number of objects to put.
 + * @return
 + *   - 0: Success; n objects supplied.
 + *   - <0: Error; code of put function.
 + */
 +static inline int
 +rte_mempool_ops_enqueue_bulk(struct rte_mempool *mp, void * const
 *obj_table,
 +unsigned n)
 +{
 +struct rte_mempool_ops *ops;
 +
 +ops = rte_mempool_ops_get(mp->ops_index);
 +return ops->put(mp->pool_data, obj_table, n);
>>> Pass by value of "pool_data", On 32 bit systems, casting back to
>>> pool_id will
>>> be an issue as void* on 32 bit is 4B. IMO, May be can use uint64_t to
>>> pass by value and typecast to void* to fix it.
>> OK. I see the problem. I'll see 4 callbacks that need to change, free,
>> get, put and get_count.
>> So the callbacks will be:
>> typedef void *(*rte_mempool_alloc_t)(struct rte_mempool *mp);
>> typedef void (*rte_mempool_free_t)(uint64_t p);
>> typedef int (*rte_mempool_put_t)(uint64_t p, void * const *obj_table,
>> unsigned int n);
>> typedef int (*rte_mempool_get_t)(uint64_t p, void **obj_table, unsigned
>> int n);
>> typedef unsigned (*rte_mempool_get_count)(uint64_t p);
> I don't quite like the uint64_t argument (I exepect that most handlers
> will use a pointer, and they will have to do a cast). What about giving
> a (struct rte_mempool *) instead? The handler function would then
> select between void * or uint64_t without a cast.
> In that case, maybe the prototype of alloc should be:
>
>typedef int (*rte_mempool_alloc_t)(struct rte_mempool *mp);
>
> It would directly set mp->pool_data and return 0 or -errno.

I would tend to agree. The uint64 didn't sit well with me :)
I would prefer the rte_mempool*


> By the way, I found a strange thing:
>
>> typedef void (*rte_mempool_free_t)(void *p);

Yes, I spotted that earler, will be fixed in next version


> [...]
>
>> void
>> rte_mempool_ops_free(struct rte_mempool *mp)
>> {
>> struct rte_mempool_ops *ops;
>>
>> ops = rte_mempool_ops_get(mp->ops_index);
>> if (ops->free == NULL)
>> return;
>> return ops->free(mp);
>> }
>>
> Seems that the free cb expects mp->pool_data, but mp is passed.

Working in it.

>
>
> Another alternative to the "uint64_t or ptr" question would be to use
> a uintptr_t instead of a uint64_t. This is won't be possible if it need
> to be a 64 bits value even on 32 bits architectures. We could then keep
> the argument as pointer, and cast it to uintptr_t if needed.

I had assumed that the requirement was for 64 bits even on 32 bit OS's. 
I've implemented
the double cast of the u64 to uintptr_t to struct pointer  done to avoid 
compiler warnings on
32-bit but I really prefer the solution of passing the rte_mempool 
pointer instead. I'll change to
*mp.

Regards,
Dave.

[dpdk-dev] [PATCH v7 1/5] mempool: support external mempool operations

2016-06-03 Thread Olivier MATZ



On 06/03/2016 12:28 PM, Hunt, David wrote:
> On 6/3/2016 7:38 AM, Jerin Jacob wrote:
>> On Thu, Jun 02, 2016 at 02:27:19PM +0100, David Hunt wrote:
>>> +/**
>>> + * @internal wrapper for external mempool manager put callback.
>>> + *
>>> + * @param mp
>>> + *   Pointer to the memory pool.
>>> + * @param obj_table
>>> + *   Pointer to a table of void * pointers (objects).
>>> + * @param n
>>> + *   Number of objects to put.
>>> + * @return
>>> + *   - 0: Success; n objects supplied.
>>> + *   - <0: Error; code of put function.
>>> + */
>>> +static inline int
>>> +rte_mempool_ops_enqueue_bulk(struct rte_mempool *mp, void * const
>>> *obj_table,
>>> +unsigned n)
>>> +{
>>> +struct rte_mempool_ops *ops;
>>> +
>>> +ops = rte_mempool_ops_get(mp->ops_index);
>>> +return ops->put(mp->pool_data, obj_table, n);
>> Pass by value of "pool_data", On 32 bit systems, casting back to
>> pool_id will
>> be an issue as void* on 32 bit is 4B. IMO, May be can use uint64_t to
>> pass by value and typecast to void* to fix it.
> 
> OK. I see the problem. I'll see 4 callbacks that need to change, free,
> get, put and get_count.
> So the callbacks will be:
> typedef void *(*rte_mempool_alloc_t)(struct rte_mempool *mp);
> typedef void (*rte_mempool_free_t)(uint64_t p);
> typedef int (*rte_mempool_put_t)(uint64_t p, void * const *obj_table,
> unsigned int n);
> typedef int (*rte_mempool_get_t)(uint64_t p, void **obj_table, unsigned
> int n);
> typedef unsigned (*rte_mempool_get_count)(uint64_t p);

I don't quite like the uint64_t argument (I exepect that most handlers
will use a pointer, and they will have to do a cast). What about giving
a (struct rte_mempool *) instead? The handler function would then
select between void * or uint64_t without a cast.
In that case, maybe the prototype of alloc should be:

  typedef int (*rte_mempool_alloc_t)(struct rte_mempool *mp);

It would directly set mp->pool_data and return 0 or -errno.

By the way, I found a strange thing:

> typedef void (*rte_mempool_free_t)(void *p);

[...]

> void
> rte_mempool_ops_free(struct rte_mempool *mp)
> {
> struct rte_mempool_ops *ops;
> 
> ops = rte_mempool_ops_get(mp->ops_index);
> if (ops->free == NULL)
> return;
> return ops->free(mp);
> }
> 

Seems that the free cb expects mp->pool_data, but mp is passed.



Another alternative to the "uint64_t or ptr" question would be to use
a uintptr_t instead of a uint64_t. This is won't be possible if it need
to be a 64 bits value even on 32 bits architectures. We could then keep
the argument as pointer, and cast it to uintptr_t if needed.

Regards,
Olivier

[dpdk-dev] [PATCH v2] mempool: replace c memcpy code semantics with optimized rte_memcpy

2016-06-03 Thread Jerin Jacob

On Thu, Jun 02, 2016 at 11:16:16PM +0200, Olivier MATZ wrote:
Hi Olivier,

> This is probably more a measure of the pure CPU cost of the mempool
> function, without considering the memory cache aspect. So, of course,
> a real use-case test should be done to confirm or not that it increases
> the performance. I'll manage to do a test and let you know the result.

OK

IMO, put rte_memcpy makes sense(this patch) as their no behavior change.
However, if get rte_memcpy with behavioral changes makes sense some platform
then we can enable it on conditional basics(I am OK with that)

> 
> By the way, not all drivers are allocating or freeing the mbufs by
> bulk, so this modification would only affect these ones. What driver
> are you using for your test?

I have tested with ThunderX nicvf pmd(uses the bulk mode).
Recently sent out driver in ml for review

Jerin

> 
> 
> Regards,
> Olivier
> 
>

[dpdk-dev] [PATCH v5 0/6] Virtio-net PMD: QEMU QTest extension for container

2016-06-03 Thread Yuanhan Liu

On Thu, Jun 02, 2016 at 06:30:18PM +0900, Tetsuya Mukawa wrote:
> Hi Yuanhan,
> 
> On 2016/06/02 16:31, Yuanhan Liu wrote:
> > But still, I'd ask do we really need 2 virtio for container solutions?
> 
> I appreciate your comments.

No, I appreciate your effort for contributing to DPDK! vhost-pmd stuff
is just brilliant!

> Let me have time to discuss it with our team.

I'm wondering could we have one solution only. IMO, the drawback of
having two (quite different) solutions might outweighs the benefit
it takes. Say, it might just confuse user.

OTOH, I'm wondering could you adapt to Jianfeng's solution? If not,
what's the missing parts, and could we fix it? I'm thinking having
one unified solution will keep ours energy/focus on one thing, making
it better and better! Having two just splits the energy; it also
introduces extra burden for maintaining.

--yliu

[dpdk-dev] [PATCH v7 1/5] mempool: support external mempool operations

2016-06-03 Thread Jerin Jacob

On Thu, Jun 02, 2016 at 02:27:19PM +0100, David Hunt wrote:

[snip]

>   /* create the internal ring if not already done */
>   if ((mp->flags & MEMPOOL_F_RING_CREATED) == 0) {

|> 1) May be RING can be replaced with some other higher abstraction name
|> for the internal MEMPOOL_F_RING_CREATED flag
|
|Agreed. I'll change to MEMPOOL_F_POOL_CREATED, since we're already 
|changing the *ring
|element of the mempool struct to *pool

Looks like you have missed the above review comment?

[snip]

> static inline struct rte_mempool_ops *
> rte_mempool_ops_get(int ops_index)
>
> return _mempool_ops_table.ops[ops_index];

|> 2) Considering "get" and "put" are the fast-path callbacks for
|> pool-manger, Is it possible to avoid the extra overhead of the
|> following
|> _load_ and additional cache line on each call,
|> rte_mempool_handler_table.handler[handler_idx]
|>
|> I understand it is for multiprocess support but I am thing can we
|> introduce something like ethernet API support for multiprocess and
|> resolve "put" and "get" functions pointer on init and store in
|> struct mempool. Some thinking like
|>
|> file: drivers/net/ixgbe/ixgbe_ethdev.c
|> search for if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
|
|I'll look at this one before posting the next version of the patch 
|(soon). :)

Have you checked the above comment, if it difficult then postpone it. But
IMO it will reduce few cycles in fast-path and reduce the cache usage in
fast path

[snip]

> +
> +/**
> + * @internal wrapper for external mempool manager put callback.
> + *
> + * @param mp
> + *   Pointer to the memory pool.
> + * @param obj_table
> + *   Pointer to a table of void * pointers (objects).
> + * @param n
> + *   Number of objects to put.
> + * @return
> + *   - 0: Success; n objects supplied.
> + *   - <0: Error; code of put function.
> + */
> +static inline int
> +rte_mempool_ops_enqueue_bulk(struct rte_mempool *mp, void * const *obj_table,
> + unsigned n)
> +{
> + struct rte_mempool_ops *ops;
> +
> + ops = rte_mempool_ops_get(mp->ops_index);
> + return ops->put(mp->pool_data, obj_table, n);

Pass by value of "pool_data", On 32 bit systems, casting back to pool_id will
be an issue as void* on 32 bit is 4B. IMO, May be can use uint64_t to
pass by value and typecast to void* to fix it.

Jerin

[dpdk-dev] [PATCH] ivshmem: add all memzones of mempool to metada

2016-06-03 Thread Ferruh Yigit

On 6/2/2016 8:04 AM, Olivier MATZ wrote:
> Hi Ferruh,
> 
> Thank you for fixing this issue.
> 
> On 06/01/2016 03:18 PM, Ferruh Yigit wrote:
>> [PATCH] ivshmem: add all memzones of mempool to metada
> 
> Minor comment: it seems the title is truncated
> 

Right, I will fix in next version of patch.

>> +static int
>> +add_mempool_to_metadata(const struct rte_mempool *mp,
>> +struct ivshmem_config *config)
>> +{
>> +struct rte_mempool_memhdr *memhdr;
>> +int ret;
>> +
>> +ret = add_mempool_memzone_to_metadata(mp, config);
>>  if (ret < 0)
>>  return -1;
>>  
>> +STAILQ_FOREACH(memhdr, >mem_list, next) {
>> +ret = add_mempool_memzone_to_metadata(memhdr->addr, config);
>> +if (ret < 0)
>> +return -1;
>> +}
>> +
>> +/* mempool consists of memzone and ring */
>>  return add_ring_to_metadata(mp->ring, config);
>>  }
>>  
> 
> In case you missed it: there is a function
> rte_mempool_mem_iter() that can be used to browse the
> memory chunks of a mempool. It's probably less convenient
> to use compared to directly browsing the list, but it
> may be more resistant to api changes.

I wasn't aware rte_mempool_mem_iter(), I will update the patch to use this.

> 
> Apart from that:
> Acked-by: Olivier Matz 
> 
> Thanks
> 

Thanks,
ferruh

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Bruce Richardson

On Fri, Jun 03, 2016 at 11:29:43AM +0100, Bruce Richardson wrote:
> On Thu, Jun 02, 2016 at 04:08:37PM -0400, Neil Horman wrote:
> > On Thu, Jun 02, 2016 at 07:41:10PM +, Wiles, Keith wrote:
> > > 
> > > On 6/2/16, 12:11 PM, "Neil Horman"  wrote:
> > > 
> > > >
> > > >1) The definition of a config structure that can be passed to 
> > > >rte_eal_init,
> > > >defining the configuration for that running process
> > > 
> > > Having a configuration structure means we have to have an ABI change to 
> > > that structure anytime we add or remove an option. I was thinking a very 
> > > simple DB of some kind would be better. Have the code query the DB to 
> > > obtain the needed information. The APIs used to query and set the DB 
> > > needs to be very easy to use as well.
> > 
> > Thats a fair point.  A decent starting point is likely a simple struct that
> > looks like this:
> > 
> > struct key_vals {
> > char *key;
> > union {
> > ulong longval;
> > void *ptrval;
> > } value;
> > };
> > 
> > struct config {
> > size_t count;
> > struct key_vals kvp[0];
> > };
> > 
> > > 
> > > Maybe each option can define its own structure if needed or just a simple 
> > > variable type can be used for the basic types (int, string, bool, ?)
> > > 
> > Well, if you have config sections that require mulitiple elements, I'd 
> > handle
> > that with naming, i.e. if you have a config group that has an int and char
> > value, I'd name them "group.intval", and "group.charval", so they are
> > independently searchable, but linked from a nomenclature standpoint.
> > 
> > > Would this work better in the long run, does a fixed structure still make 
> > > sense?
> > > 
> > No. I think you're ABI concerns are valid, but the above is likely a good
> > starting point to address them.
> > 
> > Best
> > Neil
> 
> I'll throw out one implementation idea here that I looked at previously, for
> the reason that it was simple enough implement with existing code.
> 
> We already have the cfgfile library which works with name/value pairs read 
> from
> ini files on disk. However, it would be easy enough to add couple of APIs to
> that to allow the user to "set" values inside an ini structure as well. With
> that done we can then just add a new eal_init api which takes a single
> "struct rte_cfgfile *" as parameter. For those apps that want to just use
> inifiles for configuration straight, they can then do:
> 
> cfg = rte_cfgfile_load("my_cfg_file");
> rte_eal_newinit(cfg);
> 
> Those who want a different config can instead do:
> 
> cfg = rte_cfgfile_new();
> rte_cfgfile_add_section(cfg, "dpdk");
> foreach_eal_setting_wanted:
>   rte_cfgfile_set(cfg, "dpdk", mysetting, myvalue);
> rte_eal_newinit(cfg);
> 
>From chatting to a couple of other DPDK dev's here I suspect I may not have
been entirely clear here with this example. What is being shown above is 
building
up a "config-file" in memory - or rather a config structure which happens to
have the idea of sections and values as an ini file has. There is no actual
file ever being written to disk, and for those using any non-ini config file
structure for their app, the code overhead of using the APIs above should be 
pretty much the same as building up any other set of key-value pairs in
memory to pass to an init function.

Hope this is a little clearer now.

/Bruce

[dpdk-dev] [PATCH v2] doc: add pep8 as the python code style guidelines

2016-06-03 Thread Bruce Richardson

On Fri, May 27, 2016 at 03:57:15PM +0100, John McNamara wrote:
> Added PEP8 to the DPDK Coding Style guidelines to cover Python
> contributions to DPDK.
> 
> Signed-off-by: John McNamara 
> ---
> v2: fixed section level.
> 
> 
>  doc/guides/contributing/coding_style.rst | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/doc/guides/contributing/coding_style.rst 
> b/doc/guides/contributing/coding_style.rst
> index ad1392d..1eb67f3 100644
> --- a/doc/guides/contributing/coding_style.rst
> +++ b/doc/guides/contributing/coding_style.rst
> @@ -685,3 +685,11 @@ Control Statements
>   usage();
>   /* NOTREACHED */
>   }
> +
> +
> +Python Code
> +---
> +
> +All python code should be compliant with `PEP8 (Style Guide for Python Code) 
> `_.
> +
> +The ``pep8`` tool can be used for testing compliance with the guidelines.
> -- 
Yep, I think this would be good to have, especially since there is a nice
public tool to help ensure people are compliant with this. Even though all
existing code is not (yet) compliant no reason new code shouldn't be.

Acked-by: Bruce Richardson

[dpdk-dev] Can't build DPDK-16.04 on CentOS 6.8

2016-06-03 Thread Ferruh Yigit

On 6/1/2016 9:07 PM, Martinx - ? wrote:
> Guys,
> 
>  I'm trying to build DPDK-16.04 on CentOS 6.8, but it is failing, here is
> the error:
> 
> ---
> ...
> == Build lib/librte_eal/linuxapp
> == Build lib/librte_eal/linuxapp/eal
> == Build lib/librte_eal/linuxapp/igb_uio
>   CC eal.o
>   CC eal_hugepage_info.o
>   CC eal_memory.o
>   LD
>  
> /root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/built-in.o
>   CC [M]
>  
> /root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.o
>   CC eal_thread.o
> /root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.c:
> In function 'igbuio_msix_mask_irq':
> /root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.c:157:
> error: 'PCI_MSIX_ENTRY_CTRL_MASKBIT' undeclared (first use in this function)
> /root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.c:157:
> error: (Each undeclared identifier is reported only once
> /root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.c:157:
> error: for each function it appears in.)
> make[8]: ***
> [/root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.o]
> Error 1
> make[7]: ***
> [_module_/root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio]
> Error 2
> make[6]: *** [sub-make] Error 2
> make[5]: *** [igb_uio.ko] Error 2
> make[4]: *** [igb_uio] Error 2
> make[4]: *** Waiting for unfinished jobs
>   CC eal_log.o
>   CC eal_pci.o
>   CC eal_pci_uio.o
>   CC eal_pci_vfio.o
>   CC eal_pci_vfio_mp_sync.o
> ...
> ---
> 
>  Any clue?
> 
>  I'm trying to build it by running:
> 
> --
> rpmbuild --ba /root/rpmbuild/SPECS/dpdk.spec
> --
> 
>  I removed the "doc" and the need for Xen out of it... I can take this spec
> and the dpdk-16.04.tar.gz and build it on CentOS 7.
> 
> Thanks!
> Thiago
> 

Hi Thiago,

As a reference, I tested spec file on Fedora 23, compilation worked fine.
Only I found you need to set RTE_TARGET=x86_64-default-linuxapp-gcc
before building, to be able to package all files.

meanwhile PCI_MSIX_ENTRY_CTRL_MASKBIT is defined for kernels >= 2.6.38,
BUT it already defined in igb_uio/compat.h for the case kernel headers
don't have it, so it is not related to the kernel versions, not sure
about source of the error.

Thanks,
ferruh

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Bruce Richardson

On Thu, Jun 02, 2016 at 04:08:37PM -0400, Neil Horman wrote:
> On Thu, Jun 02, 2016 at 07:41:10PM +, Wiles, Keith wrote:
> > 
> > On 6/2/16, 12:11 PM, "Neil Horman"  wrote:
> > 
> > >
> > >1) The definition of a config structure that can be passed to rte_eal_init,
> > >defining the configuration for that running process
> > 
> > Having a configuration structure means we have to have an ABI change to 
> > that structure anytime we add or remove an option. I was thinking a very 
> > simple DB of some kind would be better. Have the code query the DB to 
> > obtain the needed information. The APIs used to query and set the DB needs 
> > to be very easy to use as well.
> 
> Thats a fair point.  A decent starting point is likely a simple struct that
> looks like this:
> 
> struct key_vals {
>   char *key;
>   union {
>   ulong longval;
>   void *ptrval;
>   } value;
> };
> 
> struct config {
>   size_t count;
>   struct key_vals kvp[0];
> };
> 
> > 
> > Maybe each option can define its own structure if needed or just a simple 
> > variable type can be used for the basic types (int, string, bool, ?)
> > 
> Well, if you have config sections that require mulitiple elements, I'd handle
> that with naming, i.e. if you have a config group that has an int and char
> value, I'd name them "group.intval", and "group.charval", so they are
> independently searchable, but linked from a nomenclature standpoint.
> 
> > Would this work better in the long run, does a fixed structure still make 
> > sense?
> > 
> No. I think you're ABI concerns are valid, but the above is likely a good
> starting point to address them.
> 
> Best
> Neil

I'll throw out one implementation idea here that I looked at previously, for
the reason that it was simple enough implement with existing code.

We already have the cfgfile library which works with name/value pairs read from
ini files on disk. However, it would be easy enough to add couple of APIs to
that to allow the user to "set" values inside an ini structure as well. With
that done we can then just add a new eal_init api which takes a single
"struct rte_cfgfile *" as parameter. For those apps that want to just use
inifiles for configuration straight, they can then do:

cfg = rte_cfgfile_load("my_cfg_file");
rte_eal_newinit(cfg);

Those who want a different config can instead do:

cfg = rte_cfgfile_new();
rte_cfgfile_add_section(cfg, "dpdk");
foreach_eal_setting_wanted:
rte_cfgfile_set(cfg, "dpdk", mysetting, myvalue);
rte_eal_newinit(cfg);

We can standardize on a sectionname, or a couple of standard section names that
are used by DPDK, so that the rest of the config file can contain other data
for the app itself.

What do people think. I mainly like it because it gives us good reuse of what
is already there, and enhances our existing library. As well as this it makes
it trivially easy for apps to use ini files - which seem to be very popular here
- while still giving flexibility for others to use whatever other config format
their app prefers.

/Bruce

[dpdk-dev] [PATCH v7 1/5] mempool: support external mempool operations

2016-06-03 Thread Hunt, David



On 6/3/2016 7:38 AM, Jerin Jacob wrote:
> On Thu, Jun 02, 2016 at 02:27:19PM +0100, David Hunt wrote:
>
> [snip]
>
>>  /* create the internal ring if not already done */
>>  if ((mp->flags & MEMPOOL_F_RING_CREATED) == 0) {
> |> 1) May be RING can be replaced with some other higher abstraction name
> |> for the internal MEMPOOL_F_RING_CREATED flag
> |
> |Agreed. I'll change to MEMPOOL_F_POOL_CREATED, since we're already
> |changing the *ring
> |element of the mempool struct to *pool
>
> Looks like you have missed the above review comment?

Ah, yes, I'll address in the next patch.

I've to remove some 'const's in the alloc functions anyway
which I introduced in the last revision, which I shouldn't have. Future 
handlers
(including the stack hander) will need to set the pool_data in the alloc.


>
> [snip]
>
>> static inline struct rte_mempool_ops *
>> rte_mempool_ops_get(int ops_index)
>>
>> return _mempool_ops_table.ops[ops_index];
> |> 2) Considering "get" and "put" are the fast-path callbacks for
> |> pool-manger, Is it possible to avoid the extra overhead of the
> |> following
> |> _load_ and additional cache line on each call,
> |> rte_mempool_handler_table.handler[handler_idx]
> |>
> |> I understand it is for multiprocess support but I am thing can we
> |> introduce something like ethernet API support for multiprocess and
> |> resolve "put" and "get" functions pointer on init and store in
> |> struct mempool. Some thinking like
> |>
> |> file: drivers/net/ixgbe/ixgbe_ethdev.c
> |> search for if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> |
> |I'll look at this one before posting the next version of the patch
> |(soon). :)
>
> Have you checked the above comment, if it difficult then postpone it. But
> IMO it will reduce few cycles in fast-path and reduce the cache usage in
> fast path
>
> [snip]

I looked at it, but I'd need to do some more digging into it to see how 
it can be
done properly. I'm not seeing any performance drop at the moment, and it 
may
be a good way to improve further down the line. Is it OK to postpone?

>> +
>> +/**
>> + * @internal wrapper for external mempool manager put callback.
>> + *
>> + * @param mp
>> + *   Pointer to the memory pool.
>> + * @param obj_table
>> + *   Pointer to a table of void * pointers (objects).
>> + * @param n
>> + *   Number of objects to put.
>> + * @return
>> + *   - 0: Success; n objects supplied.
>> + *   - <0: Error; code of put function.
>> + */
>> +static inline int
>> +rte_mempool_ops_enqueue_bulk(struct rte_mempool *mp, void * const 
>> *obj_table,
>> +unsigned n)
>> +{
>> +struct rte_mempool_ops *ops;
>> +
>> +ops = rte_mempool_ops_get(mp->ops_index);
>> +return ops->put(mp->pool_data, obj_table, n);
> Pass by value of "pool_data", On 32 bit systems, casting back to pool_id will
> be an issue as void* on 32 bit is 4B. IMO, May be can use uint64_t to
> pass by value and typecast to void* to fix it.

OK. I see the problem. I'll see 4 callbacks that need to change, free, 
get, put and get_count.
So the callbacks will be:
typedef void *(*rte_mempool_alloc_t)(struct rte_mempool *mp);
typedef void (*rte_mempool_free_t)(uint64_t p);
typedef int (*rte_mempool_put_t)(uint64_t p, void * const *obj_table, 
unsigned int n);
typedef int (*rte_mempool_get_t)(uint64_t p, void **obj_table, unsigned 
int n);
typedef unsigned (*rte_mempool_get_count)(uint64_t p);


Regards,
Dave.

[dpdk-dev] Can't build DPDK-16.04 on CentOS 6.8

2016-06-03 Thread Martinx - ジェームズ

On 3 June 2016 at 06:45, Ferruh Yigit  wrote:

> On 6/1/2016 9:07 PM, Martinx - ? wrote:
> > Guys,
> >
> >  I'm trying to build DPDK-16.04 on CentOS 6.8, but it is failing, here is
> > the error:
> >
> > ---
> > ...
> > == Build lib/librte_eal/linuxapp
> > == Build lib/librte_eal/linuxapp/eal
> > == Build lib/librte_eal/linuxapp/igb_uio
> >   CC eal.o
> >   CC eal_hugepage_info.o
> >   CC eal_memory.o
> >   LD
> >
> /root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/built-in.o
> >   CC [M]
> >
> /root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.o
> >   CC eal_thread.o
> >
> /root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.c:
> > In function 'igbuio_msix_mask_irq':
> >
> /root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.c:157:
> > error: 'PCI_MSIX_ENTRY_CTRL_MASKBIT' undeclared (first use in this
> function)
> >
> /root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.c:157:
> > error: (Each undeclared identifier is reported only once
> >
> /root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.c:157:
> > error: for each function it appears in.)
> > make[8]: ***
> >
> [/root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.o]
> > Error 1
> > make[7]: ***
> >
> [_module_/root/rpmbuild/BUILD/dpdk-16.04/x86_64-default-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio]
> > Error 2
> > make[6]: *** [sub-make] Error 2
> > make[5]: *** [igb_uio.ko] Error 2
> > make[4]: *** [igb_uio] Error 2
> > make[4]: *** Waiting for unfinished jobs
> >   CC eal_log.o
> >   CC eal_pci.o
> >   CC eal_pci_uio.o
> >   CC eal_pci_vfio.o
> >   CC eal_pci_vfio_mp_sync.o
> > ...
> > ---
> >
> >  Any clue?
> >
> >  I'm trying to build it by running:
> >
> > --
> > rpmbuild --ba /root/rpmbuild/SPECS/dpdk.spec
> > --
> >
> >  I removed the "doc" and the need for Xen out of it... I can take this
> spec
> > and the dpdk-16.04.tar.gz and build it on CentOS 7.
> >
> > Thanks!
> > Thiago
> >
>
> Hi Thiago,
>
> As a reference, I tested spec file on Fedora 23, compilation worked fine.
> Only I found you need to set RTE_TARGET=x86_64-default-linuxapp-gcc
> before building, to be able to package all files.
>
> meanwhile PCI_MSIX_ENTRY_CTRL_MASKBIT is defined for kernels >= 2.6.38,
> BUT it already defined in igb_uio/compat.h for the case kernel headers
> don't have it, so it is not related to the kernel versions, not sure
> about source of the error.
>
> Thanks,
> ferruh
>

Hello Ferruh,

 The dpdk.spec file already tries to compile it by using
"RTE_TARGET=x86_64-default-linuxapp-gcc", look:

---
# rpmbuild --ba /root/rpmbuild/SPECS/dpdk.spec
.
+ unset DISPLAY
+ make O=x86_64-default-linuxapp-gcc T=x86_64-native-linuxapp-gcc config
Configuration done
.
---

 Am I right?

 But it fails... I'll need to stick with DPDK-1.8 for CentOS 6 for now,
since I can compile it without problems...

 Thanks anyway man!

Best,
Thiago

[dpdk-dev] RFC: DPDK Long Term Support

2016-06-03 Thread Matthew Hall

On Fri, Jun 03, 2016 at 03:07:49PM +, Mcnamara, John wrote:
> What changes should be backported
> -
> 
> * Bug fixes that don't break the ABI.
> 
> 
> What changes should not be backported
> -
> 
> * API or ABI breaking changes.

I think this part needs some adjusting.

It seems like there should be allowance for bug fixes where the original does 
break ABI but it is possible to make a version that doesn't.

A lot of DPDK bug fixes I see would fall into this category and it isn't 
discussed.

Matthew.

[dpdk-dev] [PATCH v2 3/3] examples/l2fwd-crypto: enable AES counter mode cipher algorithm

2016-06-03 Thread Fan Zhang

This patch enables AES counter mode algorithm support to l2fwd-crypto
sample application.

Signed-off-by: Fan Zhang 
---
 examples/l2fwd-crypto/main.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/examples/l2fwd-crypto/main.c b/examples/l2fwd-crypto/main.c
index d18c813..66fc874 100644
--- a/examples/l2fwd-crypto/main.c
+++ b/examples/l2fwd-crypto/main.c
@@ -352,6 +352,7 @@ fill_supported_algorithm_tables(void)
strcpy(supported_cipher_algo[i], "NOT_SUPPORTED");

strcpy(supported_cipher_algo[RTE_CRYPTO_CIPHER_AES_CBC], "AES_CBC");
+   strcpy(supported_cipher_algo[RTE_CRYPTO_CIPHER_AES_CTR], "AES_CTR");
strcpy(supported_cipher_algo[RTE_CRYPTO_CIPHER_AES_GCM], "AES_GCM");
strcpy(supported_cipher_algo[RTE_CRYPTO_CIPHER_NULL], "NULL");
strcpy(supported_cipher_algo[RTE_CRYPTO_CIPHER_SNOW3G_UEA2], 
"SNOW3G_UEA2");
-- 
2.5.5

[dpdk-dev] [PATCH v2 2/3] app/test: add aes-ni multi-buffer pmd test cases for AES CTR

2016-06-03 Thread Fan Zhang

Added tests cases for AES-NI MB PMD working in counter mode.

Signed-off-by: Fan Zhang 
---
 app/test/test_cryptodev.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/app/test/test_cryptodev.c b/app/test/test_cryptodev.c
index 03d6f02..45e6daa 100644
--- a/app/test/test_cryptodev.c
+++ b/app/test/test_cryptodev.c
@@ -4649,6 +4649,19 @@ static struct unit_test_suite 
cryptodev_aesni_mb_testsuite  = {
test_AES_CBC_HMAC_SHA1_encrypt_digest_sessionless),

TEST_CASE_ST(ut_setup, ut_teardown,
+   test_AES_CTR_encrypt_digest_case_1),
+   TEST_CASE_ST(ut_setup, ut_teardown,
+   test_AES_CTR_encrypt_digest_case_2),
+   TEST_CASE_ST(ut_setup, ut_teardown,
+   test_AES_CTR_encrypt_digest_case_3),
+   TEST_CASE_ST(ut_setup, ut_teardown,
+   test_AES_CTR_digest_verify_decrypt_case_1),
+   TEST_CASE_ST(ut_setup, ut_teardown,
+   test_AES_CTR_digest_verify_decrypt_case_2),
+   TEST_CASE_ST(ut_setup, ut_teardown,
+   test_AES_CTR_digest_verify_decrypt_case_3),
+
+   TEST_CASE_ST(ut_setup, ut_teardown,
test_not_in_place_crypto),

TEST_CASES_END() /**< NULL terminate unit test array */
-- 
2.5.5

[dpdk-dev] [PATCH v2 1/3] aesni_mb: add counter mode support

2016-06-03 Thread Fan Zhang

This patch provides counter mode support to AES-NI multi-buffer library.

The following cipher algorithm is enabled:
- RTE_CRYPTO_CIPHER_AES_CTR

Signed-off-by: Fan Zhang 
---
 doc/guides/cryptodevs/aesni_mb.rst |  3 +++
 doc/guides/cryptodevs/overview.rst |  6 +++---
 doc/guides/rel_notes/release_16_07.rst |  5 +
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c |  3 +++
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c | 20 
 5 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/doc/guides/cryptodevs/aesni_mb.rst 
b/doc/guides/cryptodevs/aesni_mb.rst
index fd5414d..60a8914 100644
--- a/doc/guides/cryptodevs/aesni_mb.rst
+++ b/doc/guides/cryptodevs/aesni_mb.rst
@@ -48,6 +48,9 @@ Cipher algorithms:
 * RTE_CRYPTO_SYM_CIPHER_AES128_CBC
 * RTE_CRYPTO_SYM_CIPHER_AES192_CBC
 * RTE_CRYPTO_SYM_CIPHER_AES256_CBC
+* RTE_CRYPTO_SYM_CIPHER_AES128_CTR
+* RTE_CRYPTO_SYM_CIPHER_AES192_CTR
+* RTE_CRYPTO_SYM_CIPHER_AES256_CTR

 Hash algorithms:

diff --git a/doc/guides/cryptodevs/overview.rst 
b/doc/guides/cryptodevs/overview.rst
index e1f33e1..4a84146 100644
--- a/doc/guides/cryptodevs/overview.rst
+++ b/doc/guides/cryptodevs/overview.rst
@@ -55,9 +55,9 @@ Supported Cipher Algorithms
"AES_CBC_128",x,,x,,
"AES_CBC_192",x,,x,,
"AES_CBC_256",x,,x,,
-   "AES_CTR_128",x
-   "AES_CTR_192",x
-   "AES_CTR_256",x
+   "AES_CTR_128",x,,x,,
+   "AES_CTR_192",x,,x,,
+   "AES_CTR_256",x,,x,,
"SNOW3G_UEA2",xx

 Supported Authentication Algorithms
diff --git a/doc/guides/rel_notes/release_16_07.rst 
b/doc/guides/rel_notes/release_16_07.rst
index 565055e..307e7c4 100644
--- a/doc/guides/rel_notes/release_16_07.rst
+++ b/doc/guides/rel_notes/release_16_07.rst
@@ -47,6 +47,11 @@ New Features
   * Dropped specific Xen Dom0 code.
   * Dropped specific anonymous mempool code in testpmd.

+* **Added AES-CTR support to AESNI MB PMD.**
+
+  Now AESNI MB PMD supports 128/192/256-bit counter mode AES encryption and
+  decryption.
+
 * **Added support of AES counter mode for Intel QuickAssist devices.**

   Enabled support for the AES CTR algorithm for Intel QuickAssist devices.
diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c 
b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
index 3415ac1..ce763bf 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
@@ -222,6 +222,9 @@ aesni_mb_set_session_cipher_parameters(const struct 
aesni_mb_ops *mb_ops,
case RTE_CRYPTO_CIPHER_AES_CBC:
sess->cipher.mode = CBC;
break;
+   case RTE_CRYPTO_CIPHER_AES_CTR:
+   sess->cipher.mode = CNTR;
+   break;
default:
MB_LOG_ERR("Unsupported cipher mode parameter");
return -1;
diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c 
b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
index 3806a66..d3c46ac 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
@@ -207,6 +207,26 @@ static const struct rte_cryptodev_capabilities 
aesni_mb_pmd_capabilities[] = {
}, }
}, }
},
+   {   /* AES CTR */
+   .op = RTE_CRYPTO_OP_TYPE_SYMMETRIC,
+   {.sym = {
+   .xform_type = RTE_CRYPTO_SYM_XFORM_CIPHER,
+   {.cipher = {
+   .algo = RTE_CRYPTO_CIPHER_AES_CTR,
+   .block_size = 16,
+   .key_size = {
+   .min = 16,
+   .max = 32,
+   .increment = 8
+   },
+   .iv_size = {
+   .min = 16,
+   .max = 16,
+   .increment = 0
+   }
+   }, }
+   }, }
+   },
RTE_CRYPTODEV_END_OF_CAPABILITIES_LIST()
 };

-- 
2.5.5

[dpdk-dev] [PATCH v2 0/3] Add AES Counter mode support for AES-NI MB PMD

2016-06-03 Thread Fan Zhang

This patchset adds counter mode support to AES-NI multi-buffer library
and appropriate test cases. 

This patchset depends on the following patch/patchsets
"doc: fix supported AES-CBC key lengths"
(http://dpdk.org/dev/patchwork/patch/12398/)
and "Added AES counter mode capability"
(http://dpdk.org/ml/archives/dev/2016-May/038364.html)

v2:
*added AES counter mode support to l2fwd-crypto sample application

Fan Zhang (3):
  aesni_mb: add counter mode support
  app/test: add aes-ni multi-buffer pmd test cases for AES CTR
  examples/l2fwd-crypto: enable AES counter mode cipher algorithm

 app/test/test_cryptodev.c  | 13 +
 doc/guides/cryptodevs/aesni_mb.rst |  3 +++
 doc/guides/cryptodevs/overview.rst |  6 +++---
 doc/guides/rel_notes/release_16_07.rst |  5 +
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c |  3 +++
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c | 20 
 examples/l2fwd-crypto/main.c   |  1 +
 7 files changed, 48 insertions(+), 3 deletions(-)

-- 
2.5.5

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Bruce Richardson

On Fri, Jun 03, 2016 at 10:57:22AM +0100, Bruce Richardson wrote:
> On Thu, Jun 02, 2016 at 07:17:05PM -0700, Matthew Hall wrote:
> > On Thu, Jun 02, 2016 at 06:34:58PM -0400, Neil Horman wrote:
> > > > This sort of code is very 1970s / ioctl / messy binary. And doesn't buy 
> > > > any 
> > > > performance advantage because it's just for config.
> > > > 
> > > What!?  I can't even parse that sentence.
> > 
> > I would not want to have to use the structure you proposed in user-readable 
> > code. It looked a lot like ugly ioctl stuff and I found the sysctl style 
> > interface easier to read. I don't see why that would be hard for anyone to 
> > parse but nevertheless.
> > 
> > > > https://www.freebsd.org/cgi/man.cgi?sysctl(3)
> > > > 
> > > I can't even begin to understand what you're after here.  sysctl provides 
> > > a
> > > heirarchy in _exactly_ the same way that I just proposed, by texual 
> > > consistency
> > > in naming.
> > 
> > I didn't object to the hierarchy part, but the user hostility of the 
> > example 
> > proposed.
> > 
> > > > http://json-c.github.io/json-c/json-c-0.12/doc/html/json__object_8h.html
> > > > 
> > > So, this is a fine interface to convert text config to a code format, but 
> > > thats
> > > a decision that application should be making, not something dpdk should 
> > > mandate
> > 
> > You're thinking way too narrowly here for what I am working to convey. I 
> > wasn't meaning to say JSON had to be used. I was saying, the kind of 
> > lightweight object-based API they used for modeling JSON has worked very 
> > well 
> > for modeling config data inside of my app. IE, simple functions for working 
> > with the following sort of entities (which are used in many file / 
> > interchange 
> > systems like JSON, MsgPack, YAML, etc.):
> > 
> > Objects:
> > * hashes, arbitrarily nested
> > * arrays, arbitrarily nested
> > 
> > Atoms:
> > * strings - textual
> > * strings - binary (something we should add for DPDK)
> > * integers
> > * floats / doubles
> > * booleans
> > 
> > In general I am seeing two good approaches for nesting:
> > 
> > 1. name nesting like MIB variable "x.y.z.a.b.c" - this is how sysctl works
> > 2. object nesting- this is how JSON, YAML, MsgPack, INI (implicitly w/ 
> > section 
> > names), XML etc. work...
> > 
> > to express this in the Python / Ruby / JS style syntax it would be:
> > 
> > config['x']['y']['z']['a']['b']['c']
> > using json-c it would be like
> > 
> > json_object_object_get()... until a json_object_TYPE_get().
> > 
> > What I've done for these in the past, is to make something that can parse 
> > the 
> > sysctl-style name x.y.z.0.a.b.c, detect if each dotted-item is a string, in 
> > which case reach inside the dict for the string or return NULL if not 
> > found, 
> > and if it's a number reach inside the array for that index and return NULL 
> > if 
> > not found. Here is a Python example how to take the sysctl style and look 
> > it 
> > up inside some objects. The same thing could be done using anything with at 
> > least as rich of features as what json-c provides...
> > 
> > RE_IS_INT = re.compile('^[0-9]+$')
> > def retrieve_path(data, path):
> > if isinstance(path, basestring):
> > path = path.split('.')
> > 
> > if isinstance(data, Mapping):
> > result = data.get(path[0])
> > else:
> > if not RE_IS_INT.match(str(path[0])):
> > return None
> > i = int(path[0])
> > result = data[i] if len(data) > i else None
> > 
> > if len(path) == 1:
> > return result
> > else:
> > if result:
> > return fetch(result, path[1:])
> > else:
> > return None
> > 
> > > Neil
> > 
> > Matthew
> 
> I'm afraid I don't see the need to expand out to such a large range of types, 
> or
> to add object-type nesting. I'm a big fan of simplicity, and I think Neils
> original suggestion of basic name-value pairs is a good one to start with. The
> dot notation should work fine for any hierarchies we want to have. If we get
> beyond having 2 levels in a hierarchy of config, I think we may have gone
> overboard in making things too fine-grained configurable!
> 
Minor correction: re-reading my mail afterwards, I realise I actually meant 
2 dots in the name, ie. 2 sublevels, or 3 levels, rather than 2 levels! Must
proofread more.

However, I expect most folks still got my point despite the typo! :-)

/Bruce

[dpdk-dev] [PATCH v2] e1000: configure VLAN TPID

2016-06-03 Thread Beilei Xing

This patch enables configuring the ether types of both inner and
outer VLANs. Note that outer TPID of single VLAN and inner TPID of
double VLAN are read only.

Signed-off-by: Beilei Xing 
---
v2 changes:
 Modify return value. Cause inner tpid is not supported by single vlan,
 return -ENOTSUP.
 Add return value. If want to set inner TPID of double vlan or set outer
 tpid of single vlan, return -ENOTSUP.

 drivers/net/e1000/igb_ethdev.c | 38 +++---
 1 file changed, 35 insertions(+), 3 deletions(-)

diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index f0921ee..5d37e2c 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -86,6 +86,14 @@
 #define E1000_INCVALUE_82576 (16 << IGB_82576_TSYNC_SHIFT)
 #define E1000_TSAUXC_DISABLE_SYSTIME 0x8000

+/* CTRL_EXT bit mask*/
+#define E1000_CTRL_EXT_EXT_VLAN  (1 << 26)
+
+/* VLAN Ether Type bit mask */
+#define E1000_VET_VET_EXT0x
+
+#define E1000_VET_VET_EXT_SHIFT  16
+
 static int  eth_igb_configure(struct rte_eth_dev *dev);
 static int  eth_igb_start(struct rte_eth_dev *dev);
 static void eth_igb_stop(struct rte_eth_dev *dev);
@@ -2237,13 +2245,37 @@ eth_igb_vlan_tpid_set(struct rte_eth_dev *dev,
 {
struct e1000_hw *hw =
E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
-   uint32_t reg = ETHER_TYPE_VLAN;
+   uint32_t reg;
+   uint32_t qinq;
int ret = 0;

+   qinq = E1000_READ_REG(hw, E1000_CTRL_EXT);
+   qinq &= E1000_CTRL_EXT_EXT_VLAN;
+
switch (vlan_type) {
case ETH_VLAN_TYPE_INNER:
-   reg |= (tpid << 16);
-   E1000_WRITE_REG(hw, E1000_VET, reg);
+   if (qinq) {
+   ret = -ENOTSUP;
+   PMD_DRV_LOG(WARNING,
+   "Inner vlan ether type is read-only\n");
+   } else {
+   ret = -ENOTSUP;
+   PMD_DRV_LOG(ERR, "Inner type is not supported"
+   " by single vlan\n");
+   }
+   break;
+   case ETH_VLAN_TYPE_OUTER:
+   if (qinq) {
+   reg = E1000_READ_REG(hw, E1000_VET);
+   reg = (reg & (~E1000_VET_VET_EXT)) |
+   ((uint32_t)tpid << E1000_VET_VET_EXT_SHIFT);
+   E1000_WRITE_REG(hw, E1000_VET, reg);
+   } else {
+   ret = -ENOTSUP;
+   PMD_DRV_LOG(WARNING,
+   "Single vlan ether type is read-only\n");
+   }
+
break;
default:
ret = -EINVAL;
-- 
2.5.0

[dpdk-dev] [PATCH v2] ixgbe: configure VLAN TPID

2016-06-03 Thread Beilei Xing

The patch enables configuring the ether types of both inner and
outer VLANs.

Signed-off-by: Beilei Xing 
---
v2 changes:
 Modify return value. since inner tpid is not supported by single vlan,
 return -ENOTSUP.
 Change meanning of function vlan_tpid_set.

 drivers/net/ixgbe/ixgbe_ethdev.c | 30 --
 lib/librte_ether/rte_ethdev.h|  4 ++--
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index a2b170b..0fe0dc4 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -157,6 +157,8 @@ enum ixgbevf_xcast_modes {
IXGBEVF_XCAST_MODE_ALLMULTI,
 };

+#define IXGBE_EXVET_VET_EXT_SHIFT  16
+
 static int eth_ixgbe_dev_init(struct rte_eth_dev *eth_dev);
 static int eth_ixgbe_dev_uninit(struct rte_eth_dev *eth_dev);
 static int  ixgbe_dev_configure(struct rte_eth_dev *dev);
@@ -1576,11 +1578,35 @@ ixgbe_vlan_tpid_set(struct rte_eth_dev *dev,
struct ixgbe_hw *hw =
IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
int ret = 0;
+   uint32_t reg;
+   uint32_t qinq;
+
+   qinq = IXGBE_READ_REG(hw, IXGBE_DMATXCTL);
+   qinq &= IXGBE_DMATXCTL_GDV;

switch (vlan_type) {
case ETH_VLAN_TYPE_INNER:
-   /* Only the high 16-bits is valid */
-   IXGBE_WRITE_REG(hw, IXGBE_EXVET, tpid << 16);
+   if (qinq) {
+   reg = IXGBE_READ_REG(hw, IXGBE_VLNCTRL);
+   reg = (reg & (~IXGBE_VLNCTRL_VET)) | (uint32_t)tpid;
+   IXGBE_WRITE_REG(hw, IXGBE_VLNCTRL, reg);
+   } else {
+   ret = -ENOTSUP;
+   PMD_DRV_LOG(ERR, "Inner type is not supported"
+   " by single vlan\n");
+   }
+   break;
+   case ETH_VLAN_TYPE_OUTER:
+   if (qinq) {
+   /* Only the high 16-bits is valid */
+   IXGBE_WRITE_REG(hw, IXGBE_EXVET, (uint32_t)tpid <<
+   IXGBE_EXVET_VET_EXT_SHIFT);
+   } else {
+   reg = IXGBE_READ_REG(hw, IXGBE_VLNCTRL);
+   reg = (reg & (~IXGBE_VLNCTRL_VET)) | (uint32_t)tpid;
+   IXGBE_WRITE_REG(hw, IXGBE_VLNCTRL, reg);
+   }
+
break;
default:
ret = -EINVAL;
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 2757510..57855f4 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -1124,7 +1124,7 @@ typedef int (*vlan_filter_set_t)(struct rte_eth_dev *dev,

 typedef int (*vlan_tpid_set_t)(struct rte_eth_dev *dev,
   enum rte_vlan_type type, uint16_t tpid);
-/**< @internal set the outer VLAN-TPID by an Ethernet device. */
+/**< @internal set the outer/inner VLAN-TPID by an Ethernet device. */

 typedef void (*vlan_offload_set_t)(struct rte_eth_dev *dev, int mask);
 /**< @internal set VLAN offload function by an Ethernet device. */
@@ -1408,7 +1408,7 @@ struct eth_dev_ops {
/**< Get packet types supported and identified by device*/
mtu_set_t  mtu_set; /**< Set MTU. */
vlan_filter_set_t  vlan_filter_set;  /**< Filter VLAN Setup. */
-   vlan_tpid_set_tvlan_tpid_set;  /**< Outer VLAN TPID 
Setup. */
+   vlan_tpid_set_tvlan_tpid_set;  /**< Outer/Inner VLAN 
TPID Setup. */
vlan_strip_queue_set_t vlan_strip_queue_set; /**< VLAN Stripping on 
queue. */
vlan_offload_set_t vlan_offload_set; /**< Set VLAN Offload. */
vlan_pvid_set_tvlan_pvid_set; /**< Set port based TX VLAN 
insertion */
-- 
2.5.0

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Bruce Richardson

On Thu, Jun 02, 2016 at 07:17:05PM -0700, Matthew Hall wrote:
> On Thu, Jun 02, 2016 at 06:34:58PM -0400, Neil Horman wrote:
> > > This sort of code is very 1970s / ioctl / messy binary. And doesn't buy 
> > > any 
> > > performance advantage because it's just for config.
> > > 
> > What!?  I can't even parse that sentence.
> 
> I would not want to have to use the structure you proposed in user-readable 
> code. It looked a lot like ugly ioctl stuff and I found the sysctl style 
> interface easier to read. I don't see why that would be hard for anyone to 
> parse but nevertheless.
> 
> > > https://www.freebsd.org/cgi/man.cgi?sysctl(3)
> > > 
> > I can't even begin to understand what you're after here.  sysctl provides a
> > heirarchy in _exactly_ the same way that I just proposed, by texual 
> > consistency
> > in naming.
> 
> I didn't object to the hierarchy part, but the user hostility of the example 
> proposed.
> 
> > > http://json-c.github.io/json-c/json-c-0.12/doc/html/json__object_8h.html
> > > 
> > So, this is a fine interface to convert text config to a code format, but 
> > thats
> > a decision that application should be making, not something dpdk should 
> > mandate
> 
> You're thinking way too narrowly here for what I am working to convey. I 
> wasn't meaning to say JSON had to be used. I was saying, the kind of 
> lightweight object-based API they used for modeling JSON has worked very well 
> for modeling config data inside of my app. IE, simple functions for working 
> with the following sort of entities (which are used in many file / 
> interchange 
> systems like JSON, MsgPack, YAML, etc.):
> 
> Objects:
> * hashes, arbitrarily nested
> * arrays, arbitrarily nested
> 
> Atoms:
> * strings - textual
> * strings - binary (something we should add for DPDK)
> * integers
> * floats / doubles
> * booleans
> 
> In general I am seeing two good approaches for nesting:
> 
> 1. name nesting like MIB variable "x.y.z.a.b.c" - this is how sysctl works
> 2. object nesting- this is how JSON, YAML, MsgPack, INI (implicitly w/ 
> section 
> names), XML etc. work...
> 
> to express this in the Python / Ruby / JS style syntax it would be:
> 
> config['x']['y']['z']['a']['b']['c']
> using json-c it would be like
> 
> json_object_object_get()... until a json_object_TYPE_get().
> 
> What I've done for these in the past, is to make something that can parse the 
> sysctl-style name x.y.z.0.a.b.c, detect if each dotted-item is a string, in 
> which case reach inside the dict for the string or return NULL if not found, 
> and if it's a number reach inside the array for that index and return NULL if 
> not found. Here is a Python example how to take the sysctl style and look it 
> up inside some objects. The same thing could be done using anything with at 
> least as rich of features as what json-c provides...
> 
> RE_IS_INT = re.compile('^[0-9]+$')
> def retrieve_path(data, path):
> if isinstance(path, basestring):
> path = path.split('.')
> 
> if isinstance(data, Mapping):
> result = data.get(path[0])
> else:
> if not RE_IS_INT.match(str(path[0])):
> return None
> i = int(path[0])
> result = data[i] if len(data) > i else None
> 
> if len(path) == 1:
> return result
> else:
> if result:
> return fetch(result, path[1:])
> else:
> return None
> 
> > Neil
> 
> Matthew

I'm afraid I don't see the need to expand out to such a large range of types, or
to add object-type nesting. I'm a big fan of simplicity, and I think Neils
original suggestion of basic name-value pairs is a good one to start with. The
dot notation should work fine for any hierarchies we want to have. If we get
beyond having 2 levels in a hierarchy of config, I think we may have gone
overboard in making things too fine-grained configurable!

/Bruce

[dpdk-dev] [PATCH v3] virtio: split virtio rx/tx queue

2016-06-03 Thread Yuanhan Liu

On Thu, Jun 02, 2016 at 12:12:13AM +0800, Huawei Xie wrote:
> We keep a common vq structure, containing only vq related fields,
> and then split others into RX, TX and control queue respectively.
> 
> Signed-off-by: Huawei Xie 

Acked-by: Yuanhan Liu 

And applied to dpdk-next-virtio.

Thanks.

--yliu

[dpdk-dev] [PATCH] eal/x86: fix clang build with -O0

2016-06-03 Thread Thomas Monjalon

2016-06-03 10:16, Olivier MATZ:
> On 06/03/2016 10:15 AM, Thomas Monjalon wrote:
> > From: Damjan Marion 
> > 
> > Clang seems to have a bug with asm inside inline function rte_xabort():
> > 
> > rte_rtm.h:56:15: error: invalid operand for inline asm constraint 'i'
> > asm volatile(".byte 0xc6,0xf8,%P0" :: "i" (status) : "memory");
> >  ^
> > 
> > It is seen only when building with EXTRA_CFLAGS=-O0.
> > 
> > The workaround is to replace the inline function by a macro.
> > 
> > Fixes: ba7468997ea6 ("spinlock: add HTM lock elision for x86")
> > 
> > Signed-off-by: Damjan Marion 
> > Signed-off-by: Thomas Monjalon 
> 
> Generally speaking, it's not a good idea to replace a static inline by a
> macro because it relaxes type checking. Looks ok in that case to
> workaround the clang issue.
> 
> Acked-by: Olivier Matz 

Thanks,
This patch has been appplied very quickly for the purpose of a live demo
at a VPP event: https://wiki.fd.io/view/Events

[dpdk-dev] [PATCH] eal/x86: fix clang build with -O0

2016-06-03 Thread Olivier MATZ

Hi Thomas,

On 06/03/2016 10:15 AM, Thomas Monjalon wrote:
> From: Damjan Marion 
> 
> Clang seems to have a bug with asm inside inline function rte_xabort():
> 
> rte_rtm.h:56:15: error: invalid operand for inline asm constraint 'i'
> asm volatile(".byte 0xc6,0xf8,%P0" :: "i" (status) : "memory");
>  ^
> 
> It is seen only when building with EXTRA_CFLAGS=-O0.
> 
> The workaround is to replace the inline function by a macro.
> 
> Fixes: ba7468997ea6 ("spinlock: add HTM lock elision for x86")
> 
> Signed-off-by: Damjan Marion 
> Signed-off-by: Thomas Monjalon 
> ---
>  lib/librte_eal/common/include/arch/x86/rte_rtm.h | 9 -
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/librte_eal/common/include/arch/x86/rte_rtm.h 
> b/lib/librte_eal/common/include/arch/x86/rte_rtm.h
> index d935641..0649f79 100644
> --- a/lib/librte_eal/common/include/arch/x86/rte_rtm.h
> +++ b/lib/librte_eal/common/include/arch/x86/rte_rtm.h
> @@ -50,11 +50,10 @@ void rte_xend(void)
>asm volatile(".byte 0x0f,0x01,0xd5" ::: "memory");
>  }
>  
> -static __attribute__((__always_inline__)) inline
> -void rte_xabort(const unsigned int status)
> -{
> - asm volatile(".byte 0xc6,0xf8,%P0" :: "i" (status) : "memory");
> -}
> +/* not an inline function to workaround a clang bug with -O0 */
> +#define rte_xabort(status) do { \
> + asm volatile(".byte 0xc6,0xf8,%P0" :: "i" (status) : "memory"); \
> +} while (0)
>  
>  static __attribute__((__always_inline__)) inline
>  int rte_xtest(void)
> 

Generally speaking, it's not a good idea to replace a static inline by a
macro because it relaxes type checking. Looks ok in that case to
workaround the clang issue.

Acked-by: Olivier Matz

[dpdk-dev] [PATCH] eal/x86: fix clang build with -O0

2016-06-03 Thread Thomas Monjalon

From: Damjan Marion 

Clang seems to have a bug with asm inside inline function rte_xabort():

rte_rtm.h:56:15: error: invalid operand for inline asm constraint 'i'
asm volatile(".byte 0xc6,0xf8,%P0" :: "i" (status) : "memory");
 ^

It is seen only when building with EXTRA_CFLAGS=-O0.

The workaround is to replace the inline function by a macro.

Fixes: ba7468997ea6 ("spinlock: add HTM lock elision for x86")

Signed-off-by: Damjan Marion 
Signed-off-by: Thomas Monjalon 
---
 lib/librte_eal/common/include/arch/x86/rte_rtm.h | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/lib/librte_eal/common/include/arch/x86/rte_rtm.h 
b/lib/librte_eal/common/include/arch/x86/rte_rtm.h
index d935641..0649f79 100644
--- a/lib/librte_eal/common/include/arch/x86/rte_rtm.h
+++ b/lib/librte_eal/common/include/arch/x86/rte_rtm.h
@@ -50,11 +50,10 @@ void rte_xend(void)
 asm volatile(".byte 0x0f,0x01,0xd5" ::: "memory");
 }

-static __attribute__((__always_inline__)) inline
-void rte_xabort(const unsigned int status)
-{
-   asm volatile(".byte 0xc6,0xf8,%P0" :: "i" (status) : "memory");
-}
+/* not an inline function to workaround a clang bug with -O0 */
+#define rte_xabort(status) do { \
+   asm volatile(".byte 0xc6,0xf8,%P0" :: "i" (status) : "memory"); \
+} while (0)

 static __attribute__((__always_inline__)) inline
 int rte_xtest(void)
-- 
2.7.0

[dpdk-dev] [PATCH 4/6] testpmd: handle all rxqs in rss setup

2016-06-03 Thread Wang, Zhihong



> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Wang, Zhihong
> Sent: Thursday, May 26, 2016 10:55 AM
> To: Thomas Monjalon 
> Cc: dev at dpdk.org; Ananyev, Konstantin ;
> Richardson, Bruce ; De Lara Guarch, Pablo
> 
> Subject: Re: [dpdk-dev] [PATCH 4/6] testpmd: handle all rxqs in rss setup
> 
> 
> 
> > -Original Message-
> > From: Thomas Monjalon [mailto:thomas.monjalon at 6wind.com]
> > Sent: Wednesday, May 25, 2016 5:42 PM
> > To: Wang, Zhihong 
> > Cc: dev at dpdk.org; Ananyev, Konstantin ;
> > Richardson, Bruce ; De Lara Guarch, Pablo
> > 
> > Subject: Re: [PATCH 4/6] testpmd: handle all rxqs in rss setup
> >
> > 2016-05-05 18:46, Zhihong Wang:
> > > This patch removes constraints in rxq handling when multiqueue is enabled
> > > to handle all the rxqs.
> > >
> > > Current testpmd forces a dedicated core for each rxq, some rxqs may be
> > > ignored when core number is less than rxq number, and that causes
> confusion
> > > and inconvenience.
> >
> > I have the feeling that "constraints", "confusion" and "inconvenience"
> > should be more explained.
> > Please give some examples with not enough and too much cores. Thanks
> 
> Sure, will add detailed description in v2  ;)

V2 has been sent.
We see increasing examples looking for help on this "confusion",
one recent example:
http://openvswitch.org/pipermail/dev/2016-June/072110.html

[dpdk-dev] [PATCH] vhost: fix segfault on bad descriptor address.

2016-06-03 Thread Ilya Maximets

On 02.06.2016 19:22, Rich Lane wrote:
> On Thu, Jun 2, 2016 at 3:46 AM, Ilya Maximets  > wrote:
> 
> Hi, Rich.
> Thank you for testing and analysing.
> 
> On 01.06.2016 01:06, Rich Lane wrote:
> > On Fri, May 20, 2016 at 5:50 AM, Ilya Maximets  samsung.com   samsung.com >> wrote:
> >
> > In current implementation guest application can reinitialize vrings
> > by executing start after stop. In the same time host application
> > can still poll virtqueue while device stopped in guest and it will
> > crash with segmentation fault while vring reinitialization because
> > of dereferencing of bad descriptor addresses.
> >
> >
> > I see a performance regression with this patch at large packet sizes (> 
> 768 bytes). rte_vhost_enqueue_burst is consuming 10% more cycles. Strangely, 
> there's actually a ~1% performance improvement at small packet sizes.
> >
> > The regression happens with GCC 4.8.4 and 5.3.0, but not 6.1.1.
> >
> > AFAICT this is just the compiler generating bad code. One difference is 
> that it's storing the offset on the stack instead of in a register. A 
> workaround is to move the !desc_addr check outside the unlikely macros.
> >
> > --- a/lib/librte_vhost/vhost_rxtx.c
> > +++ b/lib/librte_vhost/vhost_rxtx.c
> > @@ -147,10 +147,10 @@ copy_mbuf_to_desc(struct virtio_net *dev, 
> struct vhost_virtqueue *vq,
> > struct virtio_net_hdr_mrg_rxbuf virtio_hdr = {{0, 0, 0, 0, 
> 0, 0}, 0};
> >
> > desc = >desc[desc_idx];
> > -   if (unlikely(desc->len < vq->vhost_hlen))
> > +   desc_addr = gpa_to_vva(dev, desc->addr);
> > +   if (unlikely(desc->len < vq->vhost_hlen || !desc_addr))
> >
> >
> >  Workaround: change to "if (unlikely(desc->len < vq->vhost_hlen) || 
> !desc_addr)".
> >
> > return -1;
> >
> >
> > -   desc_addr = gpa_to_vva(dev, desc->addr);
> > rte_prefetch0((void *)(uintptr_t)desc_addr);
> >
> > virtio_enqueue_offload(m, _hdr.hdr);
> > @@ -184,6 +184,9 @@ copy_mbuf_to_desc(struct virtio_net *dev, 
> struct vhost_virtqueue *vq,
> >
> > desc = >desc[desc->next];
> > desc_addr   = gpa_to_vva(dev, desc->addr);
> > +   if (unlikely(!desc_addr))
> >
> >
> > Workaround: change to "if (!desc_addr)".
> >
> >
> > +   return -1;
> > +
> > desc_offset = 0;
> > desc_avail  = desc->len;
> > }
> >
> 
> What about other places? Is there same issues or it's only inside 
> copy_mbuf_to_desc() ?
> 
> 
> Only copy_mbuf_to_desc has the issue.

Ok.
Actually, I can't reproduce this performance issue using gcc 4.8.5 from RHEL 
7.2.
I'm not sure if I should post v2 with above fixes. May be them could be applied
while pushing patch to repository? 

Best regards, Ilya Maximets.

[dpdk-dev] [PATCH 1/3] vhost: pre update used ring for Tx and Rx

2016-06-03 Thread Xie, Huawei

On 6/1/2016 2:53 PM, Yuanhan Liu wrote:
> On Wed, Jun 01, 2016 at 06:40:41AM +, Xie, Huawei wrote:
>>> /* Retrieve all of the head indexes first to avoid caching issues. */
>>> for (i = 0; i < count; i++) {
>>> -   desc_indexes[i] = vq->avail->ring[(vq->last_used_idx + i) &
>>> -   (vq->size - 1)];
>>> +   used_idx = (vq->last_used_idx + i) & (vq->size - 1);
>>> +   desc_indexes[i] = vq->avail->ring[used_idx];
>>> +
>>> +   vq->used->ring[used_idx].id  = desc_indexes[i];
>>> +   vq->used->ring[used_idx].len = 0;
>>> +   vhost_log_used_vring(dev, vq,
>>> +   offsetof(struct vring_used, ring[used_idx]),
>>> +   sizeof(vq->used->ring[used_idx]));
>>> }
>>>  
>>> /* Prefetch descriptor index. */
>>> rte_prefetch0(>desc[desc_indexes[0]]);
>>> -   rte_prefetch0(>used->ring[vq->last_used_idx & (vq->size - 1)]);
>>> -
>>> for (i = 0; i < count; i++) {
>>> int err;
>>>  
>>> -   if (likely(i + 1 < count)) {
>>> +   if (likely(i + 1 < count))
>>> rte_prefetch0(>desc[desc_indexes[i + 1]]);
>>> -   rte_prefetch0(>used->ring[(used_idx + 1) &
>>> - (vq->size - 1)]);
>>> -   }
>>>  
>>> pkts[i] = rte_pktmbuf_alloc(mbuf_pool);
>>> if (unlikely(pkts[i] == NULL)) {
>>> @@ -916,18 +920,12 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
>>> rte_pktmbuf_free(pkts[i]);
>>> break;
>>> }
>>> -
>>> -   used_idx = vq->last_used_idx++ & (vq->size - 1);
>>> -   vq->used->ring[used_idx].id  = desc_indexes[i];
>>> -   vq->used->ring[used_idx].len = 0;
>>> -   vhost_log_used_vring(dev, vq,
>>> -   offsetof(struct vring_used, ring[used_idx]),
>>> -   sizeof(vq->used->ring[used_idx]));
>>> }
>> Had tried post-updating used ring in batch,  but forget the perf change.
> I would assume pre-updating gives better performance gain, as we are
> fiddling with avail and used ring together, which would be more cache
> friendly.

The distance between entry for avail ring and used ring are at least 8
cache lines.
The benefit comes from batch updates, if applicable.

>
>> One optimization would be on vhost_log_used_ring.
>> I have two ideas,
>> a) In QEMU side, we always assume use ring will be changed. so that we
>> don't need to log used ring in VHOST.
>>
>> Michael: feasible in QEMU? comments on this?
>>
>> b) We could always mark the total used ring modified rather than entry
>> by entry.
> I doubt it's worthwhile. One fact is that vhost_log_used_ring is
> a non operation in most time: it will take action only in the short
> gap of during live migration.
>
> And FYI, I even tried with all vhost_log_xxx being removed, it showed
> no performance boost at all. Therefore, it's not a factor that will
> impact performance.

I knew this.

>   --yliu
>

[dpdk-dev] [PATCH v3 12/13] enic: expand local Tx mbuf flags variable to 64-bits

2016-06-03 Thread Azarewicz, PiotrX T

> Coverity issue: 13218
> Fixes: fefed3d1e62c ("enic: new driver")
> 
> Suggested-by: Piotr Azarewicz 
> Signed-off-by: John Daley 
> ---
> This is essentially patch http://www.dpdk.org/dev/patchwork/patch/12642
> applied after the enic_send_packet function was melded into the main
> transmit funciton.
> 
Acked-by: Piotr Azarewicz

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Neil Horman

On Thu, Jun 02, 2016 at 07:17:05PM -0700, Matthew Hall wrote:
> On Thu, Jun 02, 2016 at 06:34:58PM -0400, Neil Horman wrote:
> > > This sort of code is very 1970s / ioctl / messy binary. And doesn't buy 
> > > any 
> > > performance advantage because it's just for config.
> > > 
> > What!?  I can't even parse that sentence.
> 
> I would not want to have to use the structure you proposed in user-readable 
> code. It looked a lot like ugly ioctl stuff and I found the sysctl style 
> interface easier to read. I don't see why that would be hard for anyone to 
> parse but nevertheless.
> 
I still don't understand how you are likening a typed variable length array to
an ioctl that uses unsigned long a single all encompasing argument, or how you
see a difference between the ASN.1 style naming that sysctl uses and the ASN.1
style naming that  I proposed.  But I suppose thats neither here nor there, we
can agree to disagree on those points.

> > > https://www.freebsd.org/cgi/man.cgi?sysctl(3)
> > > 
> > I can't even begin to understand what you're after here.  sysctl provides a
> > heirarchy in _exactly_ the same way that I just proposed, by texual 
> > consistency
> > in naming.
> 
> I didn't object to the hierarchy part, but the user hostility of the example 
> proposed.
> 
I take issue with that, I fail to see anything hostile about variable length
arrays of unioned types.  And if it iss to much, wrap a get/set api around it,
which I would expect anyway to ease lookup tasks.

> > > http://json-c.github.io/json-c/json-c-0.12/doc/html/json__object_8h.html
> > > 
> > So, this is a fine interface to convert text config to a code format, but 
> > thats
> > a decision that application should be making, not something dpdk should 
> > mandate
> 
> You're thinking way too narrowly here for what I am working to convey. I 
> wasn't meaning to say JSON had to be used. I was saying, the kind of 
> lightweight object-based API they used for modeling JSON has worked very well 
> for modeling config data inside of my app. IE, simple functions for working 
> with the following sort of entities (which are used in many file / 
> interchange 
> systems like JSON, MsgPack, YAML, etc.):
> 
> Objects:
> * hashes, arbitrarily nested
> * arrays, arbitrarily nested
> 
> Atoms:
> * strings - textual
> * strings - binary (something we should add for DPDK)
> * integers
> * floats / doubles
> * booleans
> 
> In general I am seeing two good approaches for nesting:
> 
> 1. name nesting like MIB variable "x.y.z.a.b.c" - this is how sysctl works
> 2. object nesting- this is how JSON, YAML, MsgPack, INI (implicitly w/ 
> section 
> names), XML etc. work...
> 
> to express this in the Python / Ruby / JS style syntax it would be:
> 
> config['x']['y']['z']['a']['b']['c']
> using json-c it would be like
> 
> json_object_object_get()... until a json_object_TYPE_get().
> 

soo, you want to borrow a parsing api for the sole purpose of using it as an
in-memory configuration mechanism?  I suppose thats fine, but as Bruce points
out in his note, the dpdk already has an api that can be repurposed for that
use.

> What I've done for these in the past, is to make something that can parse the 
> sysctl-style name x.y.z.0.a.b.c, detect if each dotted-item is a string, in 
> which case reach inside the dict for the string or return NULL if not found, 
> and if it's a number reach inside the array for that index and return NULL if 
> not found. Here is a Python example how to take the sysctl style and look it 
> up inside some objects. The same thing could be done using anything with at 
> least as rich of features as what json-c provides...
> 
> RE_IS_INT = re.compile('^[0-9]+$')
> def retrieve_path(data, path):
> if isinstance(path, basestring):
> path = path.split('.')
> 
> if isinstance(data, Mapping):
> result = data.get(path[0])
> else:
> if not RE_IS_INT.match(str(path[0])):
> return None
> i = int(path[0])
> result = data[i] if len(data) > i else None
> 
> if len(path) == 1:
> return result
> else:
> if result:
> return fetch(result, path[1:])
> else:
> return None
> 
> > Neil
> 

That seems500% more complicated than what dpdk really needs.  All it really
needs is key/value storage, with some minor semblance of a grouping mechanism,
and the abiilty to store some basic types (void * at
a minimum).  The caller should know implicitly based on the key lookup/set what
the type of the value is.  The API could be as simple as:

int setkey(char *key, void *value);
void *getkey(char *key);
void delkey(char *key);


It could be more complex of course, but at a minuimum, thats really enough, if
you didn't just want to expose a data structure to directly alter.

Neil

> Matthew
>

[dpdk-dev] [RFC] Yet another option for DPDK options

2016-06-03 Thread Neil Horman

On Fri, Jun 03, 2016 at 12:01:30PM +0100, Bruce Richardson wrote:
> On Fri, Jun 03, 2016 at 11:29:43AM +0100, Bruce Richardson wrote:
> > On Thu, Jun 02, 2016 at 04:08:37PM -0400, Neil Horman wrote:
> > > On Thu, Jun 02, 2016 at 07:41:10PM +, Wiles, Keith wrote:
> > > > 
> > > > On 6/2/16, 12:11 PM, "Neil Horman"  wrote:
> > > > 
> > > > >
> > > > >1) The definition of a config structure that can be passed to 
> > > > >rte_eal_init,
> > > > >defining the configuration for that running process
> > > > 
> > > > Having a configuration structure means we have to have an ABI change to 
> > > > that structure anytime we add or remove an option. I was thinking a 
> > > > very simple DB of some kind would be better. Have the code query the DB 
> > > > to obtain the needed information. The APIs used to query and set the DB 
> > > > needs to be very easy to use as well.
> > > 
> > > Thats a fair point.  A decent starting point is likely a simple struct 
> > > that
> > > looks like this:
> > > 
> > > struct key_vals {
> > >   char *key;
> > >   union {
> > >   ulong longval;
> > >   void *ptrval;
> > >   } value;
> > > };
> > > 
> > > struct config {
> > >   size_t count;
> > >   struct key_vals kvp[0];
> > > };
> > > 
> > > > 
> > > > Maybe each option can define its own structure if needed or just a 
> > > > simple variable type can be used for the basic types (int, string, 
> > > > bool, ?)
> > > > 
> > > Well, if you have config sections that require mulitiple elements, I'd 
> > > handle
> > > that with naming, i.e. if you have a config group that has an int and char
> > > value, I'd name them "group.intval", and "group.charval", so they are
> > > independently searchable, but linked from a nomenclature standpoint.
> > > 
> > > > Would this work better in the long run, does a fixed structure still 
> > > > make sense?
> > > > 
> > > No. I think you're ABI concerns are valid, but the above is likely a good
> > > starting point to address them.
> > > 
> > > Best
> > > Neil
> > 
> > I'll throw out one implementation idea here that I looked at previously, for
> > the reason that it was simple enough implement with existing code.
> > 
> > We already have the cfgfile library which works with name/value pairs read 
> > from
> > ini files on disk. However, it would be easy enough to add couple of APIs to
> > that to allow the user to "set" values inside an ini structure as well. With
> > that done we can then just add a new eal_init api which takes a single
> > "struct rte_cfgfile *" as parameter. For those apps that want to just use
> > inifiles for configuration straight, they can then do:
> > 
> > cfg = rte_cfgfile_load("my_cfg_file");
> > rte_eal_newinit(cfg);
> > 
> > Those who want a different config can instead do:
> > 
> > cfg = rte_cfgfile_new();
> > rte_cfgfile_add_section(cfg, "dpdk");
> > foreach_eal_setting_wanted:
> > rte_cfgfile_set(cfg, "dpdk", mysetting, myvalue);
> > rte_eal_newinit(cfg);
> > 
> From chatting to a couple of other DPDK dev's here I suspect I may not have
> been entirely clear here with this example. What is being shown above is 
> building
> up a "config-file" in memory - or rather a config structure which happens to
> have the idea of sections and values as an ini file has. There is no actual
> file ever being written to disk, and for those using any non-ini config file
> structure for their app, the code overhead of using the APIs above should be 
> pretty much the same as building up any other set of key-value pairs in
> memory to pass to an init function.
> 
> Hope this is a little clearer now.
> 
I'm fine with the idea of reusing the config file library that currently exists,
or more to the point, modifying it to be usable as a configuration API, rather
than a configuration file parser.  My primary interest is in separating the user
configuration mechanism from the internal library configuration lookup
mechanism.  What I would really like to be able to see is application developers
have the flexibiilty to choose their own configuration method and format, and
programatically build a configuration for the dpdk on a per-instance basis prior
to calling rte_eal_init

It seems like this approach satisfies that requirement
Neil

> /Bruce
>

[dpdk-dev] [PATCH 2/3] vhost: optimize dequeue for small packets

2016-06-03 Thread Xie, Huawei

On 5/3/2016 8:42 AM, Yuanhan Liu wrote:
> Both current kernel virtio driver and DPDK virtio driver use at least
> 2 desc buffer for Tx: the first for storing the header, and the others
> for storing the data.
>
> Therefore, we could fetch the first data desc buf before the main loop,
> and do the copy first before the check of "are we done yet?". This
> could save one check for small packets, that just have one data desc
> buffer and need one mbuf to store it.
>
> Signed-off-by: Yuanhan Liu 

Acked-by: Huawei Xie

[dpdk-dev] [PATCH 2/3] vhost: optimize dequeue for small packets

2016-06-03 Thread Xie, Huawei

On 6/1/2016 2:41 PM, Yuanhan Liu wrote:
> On Wed, Jun 01, 2016 at 06:24:18AM +, Xie, Huawei wrote:
>> On 5/3/2016 8:42 AM, Yuanhan Liu wrote:
>>> Both current kernel virtio driver and DPDK virtio driver use at least
>>> 2 desc buffer for Tx: the first for storing the header, and the others
>>> for storing the data.
>> Tx could prepend some space for virtio net header whenever possible, so
>> that it could use only one descriptor.
> In such case, it will work as well: it will goto the "else" code path
> then.

It works, but the problem is the statement.

>> Another thing is this doesn't reduce the check because you also add a check.
> Actually, yes, it does save check. Before this patch, we have:
>
>   while (not done yet) {
>  if (desc_is_drain)
>  ...;
>
>  if (mbuf_is_full)
>  ...;
>
>  COPY();
>   }
>
> Note that the "while" check will be done twice, therefore, it's 4
> checks in total. After this patch, it would be:
>
>   if (virtio_net_hdr_takes_one_desc)
>   ...
>
>   while (1) {
>   COPY();
>
>   if (desc_is_drain) {
>   break if done;
>   ...;
>   }
>
>   if (mbuf_is_full {
>   /* small packets will bypass this check */
>   ;
>   }
>   }
>
> So, for small packets, it takes 2 checks only, which actually saves
> 2 checks.
>
>   --yliu
>

For small packets, we have three checks totally. It worth we use space
to save checks, so would ack this patch, but note that there is one
assumption that rte_memcpy API could handle zero length copy.

[dpdk-dev] [PATCH v2] e1000: configure VLAN TPID

2016-06-03 Thread Wang, Xiao W

Hi,

> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Beilei Xing
> Sent: Friday, June 3, 2016 10:59 AM
> To: Lu, Wenzhuo 
> Cc: dev at dpdk.org; Xing, Beilei 
> Subject: [dpdk-dev] [PATCH v2] e1000: configure VLAN TPID
> 
> This patch enables configuring the ether types of both inner and outer VLANs.
> Note that outer TPID of single VLAN and inner TPID of double VLAN are read
> only.
> 
> Signed-off-by: Beilei Xing 
> ---
> v2 changes:
>  Modify return value. Cause inner tpid is not supported by single vlan,  
> return -
> ENOTSUP.
>  Add return value. If want to set inner TPID of double vlan or set outer  
> tpid of
> single vlan, return -ENOTSUP.
> 
Acked-by: Xiao Wang

[dpdk-dev] [PATCH v2] ixgbe: configure VLAN TPID

2016-06-03 Thread Lu, Wenzhuo

Hi,


> -Original Message-
> From: Xing, Beilei
> Sent: Friday, June 3, 2016 10:58 AM
> To: Lu, Wenzhuo
> Cc: dev at dpdk.org; Xing, Beilei
> Subject: [PATCH v2] ixgbe: configure VLAN TPID
> 
> The patch enables configuring the ether types of both inner and outer VLANs.
> 
> Signed-off-by: Beilei Xing 
Acked-by: Wenzhuo Lu

[dpdk-dev] [PATCH] ethdev: change comments of VLAN type

2016-06-03 Thread Xing, Beilei



> -Original Message-
> From: Lu, Wenzhuo
> Sent: Friday, June 3, 2016 10:53 AM
> To: Xing, Beilei ; Wu, Jingjing  intel.com>
> Cc: dev at dpdk.org; Xing, Beilei 
> Subject: RE: [dpdk-dev] [PATCH] ethdev: change comments of VLAN type
> 
> Hi Beilei,
> 
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Beilei Xing
> > Sent: Thursday, May 26, 2016 3:28 PM
> > To: Wu, Jingjing
> > Cc: dev at dpdk.org; Xing, Beilei
> > Subject: [dpdk-dev] [PATCH] ethdev: change comments of VLAN type
> >
> > If the packet carries a single VLAN header, it is treated as the outer 
> > header.
> > So change the comments of inner VLAN and outer VLAN.
> >
> > Signed-off-by: Beilei Xing 
> > ---
> >  doc/guides/rel_notes/release_16_07.rst | 3 +++
> >  lib/librte_ether/rte_ethdev.h  | 4 ++--
> >  2 files changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/doc/guides/rel_notes/release_16_07.rst
> > b/doc/guides/rel_notes/release_16_07.rst
> > index 30e78d4..29db86c 100644
> > --- a/doc/guides/rel_notes/release_16_07.rst
> > +++ b/doc/guides/rel_notes/release_16_07.rst
> > @@ -116,6 +116,9 @@ API Changes
> >ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
> >tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff.
> >
> > +* The comments of ``ETH_VLAN_TYPE_INNER`` and
> > ``ETH_VLAN_TYPE_OUTER``
> > +in
> > +  ``rte_vlan_type`` are changed.
> I think we need more explanation here. At least the info in the commit log.

Hi Wenzhuo,
Thanks for your comments, and plan to combine this patch with i40 driver 
modification, it should be better.
Beilei

[dpdk-dev] [PATCH] e1000: configure VLAN TPID

2016-06-03 Thread Xing, Beilei


> -Original Message-
> From: Wang, Xiao W
> Sent: Thursday, June 2, 2016 2:44 PM
> To: Xing, Beilei ; Lu, Wenzhuo
> 
> Cc: dev at dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] e1000: configure VLAN TPID
> 
> Hi Beilei,
> 
> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Beilei Xing
> > Sent: Thursday, April 21, 2016 4:56 PM
> > To: Lu, Wenzhuo 
> > Cc: dev at dpdk.org; Xing, Beilei 
> > Subject: [dpdk-dev] [PATCH] e1000: configure VLAN TPID
> >
> > This patch enables configuring the ether types of both inner and outer
> VLANs.
> > Note that TPID of single or inner VLAN is read only.
> >
> > Signed-off-by: Beilei Xing 
> > ---
> >  drivers/net/e1000/igb_ethdev.c | 34
> > +++---
> >  1 file changed, 31 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/e1000/igb_ethdev.c
> > b/drivers/net/e1000/igb_ethdev.c index e0053fe..c957fbe 100644
> > --- a/drivers/net/e1000/igb_ethdev.c
> > +++ b/drivers/net/e1000/igb_ethdev.c
> > @@ -86,6 +86,14 @@
> >  #define E1000_INCVALUE_82576 (16 << IGB_82576_TSYNC_SHIFT)
> >  #define E1000_TSAUXC_DISABLE_SYSTIME 0x8000
> >
> > +/* CTRL_EXT bit mask*/
> > +#define E1000_CTRL_EXT_EXT_VLAN  (1 << 26)
> > +
> > +/* VLAN Ether Type bit mask */
> > +#define E1000_VET_VET_EXT0x
> > +
> > +#define E1000_VET_VET_EXT_SHIFT  16
> > +
> >  static int  eth_igb_configure(struct rte_eth_dev *dev);  static int
> > eth_igb_start(struct rte_eth_dev *dev);  static void
> > eth_igb_stop(struct rte_eth_dev *dev); @@ -2242,13 +2250,33 @@
> > eth_igb_vlan_tpid_set(struct rte_eth_dev *dev,  {
> > struct e1000_hw *hw =
> > E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> > -   uint32_t reg = ETHER_TYPE_VLAN;
> > +   uint32_t reg;
> > +   uint32_t qinq;
> > int ret = 0;
> >
> > +   qinq = E1000_READ_REG(hw, E1000_CTRL_EXT);
> > +   qinq &= E1000_CTRL_EXT_EXT_VLAN;
> > +
> > switch (vlan_type) {
> > case ETH_VLAN_TYPE_INNER:
> > -   reg |= (tpid << 16);
> > -   E1000_WRITE_REG(hw, E1000_VET, reg);
> > +   if (qinq)
> > +   PMD_DRV_LOG(WARNING,
> > +   "inner vlan ether type is read-only\n");
> 
> Add: ret = -ENOTSUP or something else, so that the programmer can handle
> it.
> The same for below.

Yes, I think -ENOTSUP should be more appropriate here.

> 
> > +   else {
> > +   PMD_DRV_LOG(ERR, "not set QinQ on yet\n");
> > +   ret = -EIO;
> > +   }
> > +   break;
> > +   case ETH_VLAN_TYPE_OUTER:
> > +   if (qinq) {
> > +   reg = E1000_READ_REG(hw, E1000_VET);
> > +   reg = (reg & (~E1000_VET_VET_EXT)) |
> > +   ((uint32_t)tpid <<
> E1000_VET_VET_EXT_SHIFT);
> > +   E1000_WRITE_REG(hw, E1000_VET, reg);
> > +   } else
> > +   PMD_DRV_LOG(WARNING,
> > +   "single vlan ether type is read-only\n");
> > +
> > break;
> > default:
> > ret = -EINVAL;
> > --
> > 2.5.0
> 
> Since both inner and outer tpid are considered in this patch, the comment in
> rte_ethdev.h "vlan_tpid_set;  /**< Outer VLAN TPID Setup. */" should be
> changed
> accordingly.

Hi Xiao,
Thanks for your review, I've sent a new patch.

> 
> Best Regards,
> Xiao

[dpdk-dev] [PATCH] ethdev: change comments of VLAN type

2016-06-03 Thread Lu, Wenzhuo

Hi Beilei,


> -Original Message-
> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Beilei Xing
> Sent: Thursday, May 26, 2016 3:28 PM
> To: Wu, Jingjing
> Cc: dev at dpdk.org; Xing, Beilei
> Subject: [dpdk-dev] [PATCH] ethdev: change comments of VLAN type
> 
> If the packet carries a single VLAN header, it is treated as the outer header.
> So change the comments of inner VLAN and outer VLAN.
> 
> Signed-off-by: Beilei Xing 
> ---
>  doc/guides/rel_notes/release_16_07.rst | 3 +++
>  lib/librte_ether/rte_ethdev.h  | 4 ++--
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/doc/guides/rel_notes/release_16_07.rst
> b/doc/guides/rel_notes/release_16_07.rst
> index 30e78d4..29db86c 100644
> --- a/doc/guides/rel_notes/release_16_07.rst
> +++ b/doc/guides/rel_notes/release_16_07.rst
> @@ -116,6 +116,9 @@ API Changes
>ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss,
>tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff.
> 
> +* The comments of ``ETH_VLAN_TYPE_INNER`` and
> ``ETH_VLAN_TYPE_OUTER``
> +in
> +  ``rte_vlan_type`` are changed.
I think we need more explanation here. At least the info in the commit log.

[dpdk-dev] [PATCH v2] mempool: replace c memcpy code semantics with optimized rte_memcpy

2016-06-03 Thread Olivier MATZ

Hi Jerin,

On 06/02/2016 11:39 AM, Jerin Jacob wrote:
> On Thu, Jun 02, 2016 at 09:36:34AM +0200, Olivier MATZ wrote:
>> I think the LIFO behavior should occur on a per-bulk basis. I mean,
>> it should behave like in the exemplaes below:
>>
>>   // pool cache is in state X
>>   obj1 = mempool_get(mp)
>>   obj2 = mempool_get(mp)
>>   mempool_put(mp, obj2)
>>   mempool_put(mp, obj1)
>>   // pool cache is back in state X
>>
>>   // pool cache is in state X
>>   bulk1 = mempool_get_bulk(mp, 16)
>>   bulk2 = mempool_get_bulk(mp, 16)
>>   mempool_put_bulk(mp, bulk2, 16)
>>   mempool_put_bulk(mp, bulk1, 16)
>>   // pool cache is back in state X
>>
> 
> Per entry LIFO behavior make more sense in _bulk_ case as recently en-queued 
> buffer
> comes out for next "get" makes more chance that buffer in Last level cache.

Yes, from a memory cache perspective, I think you are right.

In practise, I'm not sure it's so important because on many hw drivers,
a packet buffer returns to the pool only after a round of the tx ring.
So I'd say it won't make a big difference here.

>> Note that today it's not the case for bulks, since object addresses
>> are reversed only in get(), we are not back in the original state.
>> I don't really see the advantage of this.
>>
>> Removing the reversing may accelerate the cache in case of bulk get,
>> I think.
> 
> I tried in my setup, it was dropping the performance. Have you got
> improvement in any setup?

I know that the mempool_perf autotest is not representative
of a real use case, but it gives a trend. I did a quick test with
- the legacy code,
- the rte_memcpy in put()
- the rte_memcpy in both put() and get() (no reverse)
It seems that removing the reversing brings ~50% of enhancement
with bulks of 32 (on an westmere):

legacy
mempool_autotest cache=512 cores=1 n_get_bulk=32 n_put_bulk=32 n_keep=32
rate_persec=839922483
mempool_autotest cache=512 cores=1 n_get_bulk=32 n_put_bulk=32
n_keep=128 rate_persec=849792204
mempool_autotest cache=512 cores=2 n_get_bulk=32 n_put_bulk=32 n_keep=32
rate_persec=1617022156
mempool_autotest cache=512 cores=2 n_get_bulk=32 n_put_bulk=32
n_keep=128 rate_persec=1675087052
mempool_autotest cache=512 cores=4 n_get_bulk=32 n_put_bulk=32 n_keep=32
rate_persec=3202914713
mempool_autotest cache=512 cores=4 n_get_bulk=32 n_put_bulk=32
n_keep=128 rate_persec=3268725963

rte_memcpy in put() (your patch proposal)
mempool_autotest cache=512 cores=1 n_get_bulk=32 n_put_bulk=32 n_keep=32
rate_persec=762157465
mempool_autotest cache=512 cores=1 n_get_bulk=32 n_put_bulk=32
n_keep=128 rate_persec=744593817
mempool_autotest cache=512 cores=2 n_get_bulk=32 n_put_bulk=32 n_keep=32
rate_persec=1500276326
mempool_autotest cache=512 cores=2 n_get_bulk=32 n_put_bulk=32
n_keep=128 rate_persec=1461347942
mempool_autotest cache=512 cores=4 n_get_bulk=32 n_put_bulk=32 n_keep=32
rate_persec=2974076107
mempool_autotest cache=512 cores=4 n_get_bulk=32 n_put_bulk=32
n_keep=128 rate_persec=2928122264

rte_memcpy in put() and get()
mempool_autotest cache=512 cores=1 n_get_bulk=32 n_put_bulk=32 n_keep=32
rate_persec=974834892
mempool_autotest cache=512 cores=1 n_get_bulk=32 n_put_bulk=32
n_keep=128 rate_persec=1129329459
mempool_autotest cache=512 cores=2 n_get_bulk=32 n_put_bulk=32 n_keep=32
rate_persec=2147798220
mempool_autotest cache=512 cores=2 n_get_bulk=32 n_put_bulk=32
n_keep=128 rate_persec=2232457625
mempool_autotest cache=512 cores=4 n_get_bulk=32 n_put_bulk=32 n_keep=32
rate_persec=4510816664
mempool_autotest cache=512 cores=4 n_get_bulk=32 n_put_bulk=32
n_keep=128 rate_persec=4582421298

This is probably more a measure of the pure CPU cost of the mempool
function, without considering the memory cache aspect. So, of course,
a real use-case test should be done to confirm or not that it increases
the performance. I'll manage to do a test and let you know the result.

By the way, not all drivers are allocating or freeing the mbufs by
bulk, so this modification would only affect these ones. What driver
are you using for your test?


Regards,
Olivier

78 matches

Mail list logo