On 30 Jun 2021, at 15:30, Van Haaren, Harry wrote:
>> -----Original Message----- >> From: Eelco Chaudron <echau...@redhat.com> >> Sent: Wednesday, June 30, 2021 2:12 PM >> To: Amber, Kumar <kumar.am...@intel.com>; Van Haaren, Harry >> <harry.van.haa...@intel.com> >> Cc: d...@openvswitch.org; i.maxim...@ovn.org; Flavio Leitner >> <f...@sysclose.org>; >> Stokes, Ian <ian.sto...@intel.com> >> Subject: Re: [ovs-dev] [v4 10/12] dpif-netdev/mfex: Add AVX512 based >> optimized >> miniflow extract >> >> This patch was an interesting patch to review and being reminded about >> endianness, >> and this site, >> https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mm512_maskz >> _permutexvar_epi8&expand=4315, got me through it ;) > > Hah, yes the Intrinsics Guide is very useful for reading/investigating > what/how instructions can do. > Its... almost always open in a browser in some tab here! :) > > >> Some comments below... >> >> //Eelco > > Thanks for review, I'll snip away large chunks of code to reduce verbosity. > > Regards, -Harry > > >> On 17 Jun 2021, at 18:27, Kumar Amber wrote: >> >>> From: Harry van Haaren <harry.van.haa...@intel.com> > > <snip> > >>> +/* AVX512-BW level permutex2var_epi8 emulation. */ >>> +static inline __m512i >>> +__attribute__((target("avx512bw"))) >> >> Are these targets universal enough for all supported compilers, if not we >> might need >> to move them to individual macros in compile.h. > > Yes, these are the standard gcc/clang etc compiler -m <isa level> switches. > > Search for "-mavx512bw" on e.g. this GCC page, lists them all; > https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html > > If a compiler does not understand them, we will have to #ifdef that compiler > out, > as it just doesn't support the ISA. Guess my concern is with the windows/Microsoft compiler, as I have no windows setup, I can not verify this. >>> +/* Static const instances of profiles. These are compile-time constants, >>> + * and are specialized into individual miniflow-extract functions. >>> + */ >>> +static const struct mfex_profile mfex_profiles[PROFILE_COUNT] = >>> +{ >>> + [PROFILE_ETH_IPV4_UDP] = { >>> + .probe_mask.u8_data = { PATTERN_ETHERTYPE_MASK PATTERN_IPV4_MASK >> }, >>> + .probe_data.u8_data = { PATTERN_ETHERTYPE_IPV4 PATTERN_IPV4_UDP}, >>> + >>> + .store_shuf.u8_data = { PATTERN_IPV4_UDP_SHUFFLE }, >>> + .store_kmsk = PATTERN_IPV4_UDP_KMASK, >>> + >>> + .mf_bits = { 0x18a0000000000000, 0x0000000000040401}, >> >> I did some manual translation from these bits, to parts of the flow >> structure they >> represent, but it was not something fun to do. Maybe you still have your >> notes and >> could add some to the code? It might help debugging? > > Agree that these bits are "arbitrary" to some degree, they're offsets into > the miniflow > datastructure, with each bit representing 8-bytes of data. > > These are derived from the output of the autovalidator, which prints "good" > and "test" > values. Nice forgot about that part ;) > <snip> > >> As we are explicitly manual defining the mf_bits I think we also need to >> update the >> comment in the “struct flow” definition to reflect that if the order change >> these >> specific functions need updating also. > > There's an "ABI Macro" in that struct, we can throw one of those build-time > asserts into here > too to be "extra sure", but this would be caught by running MFEX > autovalidation unit tests. Guess they will but not sure if the dpdk test is part of the standard tests. Anyway, this is the comment I think should be updated: https://github.com/openvswitch/ovs/blob/e5b5008acdf08e90874f5b4da09ffe162fc762aa/include/openvswitch/flow.h#L97 > <snip> > >>> +/* Generic loop to process any mfex profile. This code is specialized into >>> + * multiple actual MFEX implementation functions. Its marked ALWAYS_INLINE >>> + * to ensure the compiler specializes each instance. The code is marked >>> "hot" >>> + * to inform the compiler this is a hotspot in the program, encouraging >>> + * inlining of callee functions such as the permute calls. >>> + */ >>> +static inline uint32_t ALWAYS_INLINE >>> +__attribute__ ((hot)) >> >> Do we need to move this to a macro in compiler.h as OVS_HOT to make sure >> it’s not >> causing issues on other compilers like windows, etc? > > I'm not sure, we could I suppose, I'm not strongly for or against. Today this > patchset doesn't modify compiler.h at all, perhaps cleaner to update in a > later patch, > and consider other functions for tagging with OVS_HOT too in that patchset? > > <snip> I do not have a strong preference either. It looks like this is the only patch/place using it, and as you suggested, we could do it in a follow-up patch if we start using it in more places. >>> + /* Copy known dp packet offsets to the dp_packet instance. */ >>> + memcpy(&packet->l2_pad_size, &profile->dp_pkt_offs, >>> + sizeof(uint16_t) * 4); >>> + >> >> Here we copy four fields to the packet structure (l2_pad_size, l2_5_ofs, >> l3_ofs, >> l4_ofs). I think we should add some static_assert to make sure the order of >> these >> fields do not change. > > Yes, I think Flavio had a similar comment in one of the reviews. Good point, > has been addressed with BUILD_ASSERT_DELC() and offsets into struct by Amber. ACK. > <snip to end> _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev