On 30 Jun 2021, at 11:32, Van Haaren, Harry wrote:

>> -----Original Message-----
>> From: Eelco Chaudron <echau...@redhat.com>
>> Sent: Wednesday, June 30, 2021 10:18 AM
>> To: Van Haaren, Harry <harry.van.haa...@intel.com>
>> Cc: Amber, Kumar <kumar.am...@intel.com>; d...@openvswitch.org;
>> i.maxim...@ovn.org
>> Subject: Re: [ovs-dev] [v4 03/12] dpif-netdev: Add study function to select 
>> the best
>> mfex function
>>
>>
>>
>> On 29 Jun 2021, at 18:32, Van Haaren, Harry wrote:
>>
>>>> -----Original Message-----
>>>> From: dev <ovs-dev-boun...@openvswitch.org> On Behalf Of Eelco Chaudron
>>>> Sent: Tuesday, June 29, 2021 1:38 PM
>>>> To: Amber, Kumar <kumar.am...@intel.com>
>>>> Cc: d...@openvswitch.org; i.maxim...@ovn.org
>>>> Subject: Re: [ovs-dev] [v4 03/12] dpif-netdev: Add study function to 
>>>> select the
>> best
>>>> mfex function
>
> <snip previous discussion>
>
>>> Perfect is the enemy of good... I'd prefer focus on getting existing code 
>>> changes
>> merged,
>>> and add additional (optional) parameters in future if deemed useful in real 
>>> world
>> testing?
>>
>> See Flavio’s reply, as those were the concerns same concerns I thought of.
>
> Yes - thanks for combining threads - I'm writing a detailed reply there as we 
> speak here :)
> I'll send that reply shortly.
>
> <snip code/patch changes>
>
>>>>> +        if (max_hits >= MFEX_MIN_HIT_COUNT_FOR_USE) {
>>>>> +            /* Set the implementation to index with max_hits. */
>>>>> +            pmd->miniflow_extract_opt =
>>>>> +                        miniflow_funcs[best_func_index].extract_func;
>>>>> +            VLOG_INFO("MFEX study chose impl %s: (hits %d/%d pkts)\n",
>>>>> +                      miniflow_funcs[best_func_index].name, max_hits,
>>>>> +                      stats->pkt_count);
>>>>
>>>> We have no idea which PMD the mode is selected for guess we might need to 
>>>> add
>>>> this?
>>>>
>>>> Maybe we should report the numbers/hits for the other methods, as they 
>>>> might
>> be
>>>> equal, and some might be faster in execution time?
>>>
>>> As above, the implementations are sorted in performance order. Performance
>>> here can be known by micro-benchmarks, and developers of such SIMD optimized
>>> code can be expected to know which impl is fastest.
>>
>> Don’t think we can, as it’s not documented in the code, and some one can 
>> just add
>> his own, and has no clue about the existing ones.
>
> Yes, in theory somebody could add his own, and get this wrong. There are many 
> many
> things that could go wrong when making code changes. We cannot document 
> everything.

I meant that the code currently does not document that the implementation 
table, mfex_impls[], is in order of preference. So I think this should be added.

>>> In our current code, the avx512_vbmi_* impls are always before the avx512_*
>>> impls, as the VBMI instruction set allows a faster runtime.
>>
>> Guess we need some documentation in the developer's section on how to add
>> processor optimized functions, and how to benchmark them (and maybe some
>> benchmark data for the current implementations).
>> Also, someone can write a sloppy avx512_vbmi* function that might be slower 
>> than
>> an avx512_*, right?
>
> What are we trying to achieve here? What is the root problem that is being 
> addressed?
>
> Yes, somebody "could" write sloppy (complex, advanced, ISA specific, SIMD) 
> avx512 code,
> and have it be slower. Who is realistically going to do that?
>
> I'm fine with documenting a few things if they make sense to document, but
> trying to "hand hold" at every level just doesn't work. Adding sections on how
> to benchmark code, and how function pointers work and how to add them?
> These things are documented in various places across the internet.
>
> If there's really an interest to learn AVX512 SIMD optimization, reach out to 
> the
> OVS community, put me on CC, and I'll be willing to help. Adding documentation
> ad nauseam is not the solution, as each optimization is likely to have subtle 
> differences.
I think the problem is that except you, and some other small group at Intel 
might know AVX512, but for most of the OVS community this is moving back to 
handwritten assembler. So at least some guidelines on what you should do when 
adding a custom function would help. Like order them in priority, maybe some 
simple example on how to benchmark the runtime of the mfex function. Don't 
think this has to be part of this patch, but a follow-up would be nice.
>
>
>>> <snip code changes till end of patch>
> <snip snip away irrelevant old discussions>

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to