> On Apr 11, 2019, at 1:12 PM, Yongseok Koh <ys...@mellanox.com> wrote:
>
>>
>> On Apr 10, 2019, at 11:07 PM, Pavan Nikhilesh Bhagavatula
>> <pbhagavat...@marvell.com> wrote:
>>
>> Hi Yongseok,
>>
>>> -----Original Message-----
>>> From: Yongseok Koh <ys...@mellanox.com>
>>> Sent: Wednesday, April 10, 2019 11:08 PM
>>> To: Pavan Nikhilesh Bhagavatula <pbhagavat...@marvell.com>
>>> Cc: Thomas Monjalon <tho...@monjalon.net>; dev <dev@dpdk.org>; Jerin
>>> Jacob Kollanukkaran <jer...@marvell.com>; jerinjac...@gmail.com
>>> Subject: [EXT] Re: [dpdk-dev] [PATCH v8 2/4] meson: add infra to support
>>> machine specific flags
>>>
>>> External Email
>>>
>>> ----------------------------------------------------------------------
>>>
>>>> On Apr 10, 2019, at 9:13 AM, jerinjac...@gmail.com wrote:
>>>>
>>>> From: Pavan Nikhilesh <pbhagavat...@marvell.com>
>>>>
>>>> Currently, RTE_* flags are set based on the implementer ID but there
>>>> might be some micro arch specific differences from the same vendor eg.
>>>> CACHE_LINESIZE. Add support to set micro arch specific flags.
>>>>
>>>> Signed-off-by: Pavan Nikhilesh <pbhagavat...@marvell.com>
>>>> Signed-off-by: Jerin Jacob <jer...@marvell.com>
>>>> ---
>>>> config/arm/meson.build | 56 ++++++++++++++++++++++++------------------
>>>> 1 file changed, 32 insertions(+), 24 deletions(-)
>>>>
>>>> diff --git a/config/arm/meson.build b/config/arm/meson.build
>>>> index 170a4981a..24bce2b39 100644
>>>> --- a/config/arm/meson.build
>>>> +++ b/config/arm/meson.build
>>>> @@ -7,25 +7,6 @@ march_opt = '-march=@0@'.format(machine)
>>>>
>>>>  arm_force_native_march = false
>>>>
>>>> -machine_args_generic = [
>>>> -    ['default', ['-march=armv8-a+crc+crypto']],
>>>> -    ['native', ['-march=native']],
>>>> -    ['0xd03', ['-mcpu=cortex-a53']],
>>>> -    ['0xd04', ['-mcpu=cortex-a35']],
>>>> -    ['0xd05', ['-mcpu=cortex-a55']],
>>>> -    ['0xd07', ['-mcpu=cortex-a57']],
>>>> -    ['0xd08', ['-mcpu=cortex-a72']],
>>>> -    ['0xd09', ['-mcpu=cortex-a73']],
>>>> -    ['0xd0a', ['-mcpu=cortex-a75']],
>>>> -    ['0xd0b', ['-mcpu=cortex-a76']],
>>>> -]
>>>> -machine_args_cavium = [
>>>> -    ['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
>>>> -    ['native', ['-march=native']],
>>>> -    ['0xa1', ['-mcpu=thunderxt88']],
>>>> -    ['0xa2', ['-mcpu=thunderxt81']],
>>>> -    ['0xa3', ['-mcpu=thunderxt83']]]
>>>> -
>>>>  flags_common_default = [
>>>>      # Accelarate rte_memcpy. Be sure to run unit test (memcpy_perf_autotest)
>>>>      # to determine the best threshold in code. Refer to notes in source file
>>>> @@ -52,12 +33,10 @@ flags_generic = [
>>>>      ['RTE_USE_C11_MEM_MODEL', true],
>>>>      ['RTE_CACHE_LINE_SIZE', 128]]
>>>>  flags_cavium = [
>>>> -    ['RTE_MACHINE', '"thunderx"'],
>>>>      ['RTE_CACHE_LINE_SIZE', 128],
>>>>      ['RTE_MAX_NUMA_NODES', 2],
>>>>      ['RTE_MAX_LCORE', 96],
>>>> -    ['RTE_MAX_VFIO_GROUPS', 128],
>>>> -    ['RTE_USE_C11_MEM_MODEL', false]]
>>>> +    ['RTE_MAX_VFIO_GROUPS', 128]]
>>>>  flags_dpaa = [
>>>>      ['RTE_MACHINE', '"dpaa"'],
>>>>      ['RTE_USE_C11_MEM_MODEL', true],
>>>> @@ -71,6 +50,27 @@ flags_dpaa2 = [
>>>>      ['RTE_MAX_NUMA_NODES', 1],
>>>>      ['RTE_MAX_LCORE', 16],
>>>>      ['RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', false]]
>>>> +flags_default_extra = []
>>>> +flags_thunderx_extra = [
>>>> +    ['RTE_MACHINE', '"thunderx"'],
>>>> +    ['RTE_USE_C11_MEM_MODEL', false]]
>>>> +
>>>> +machine_args_generic = [
>>>> +    ['default', ['-march=armv8-a+crc+crypto']],
>>>> +    ['native', ['-march=native']],
>>>> +    ['0xd03', ['-mcpu=cortex-a53']],
>>>> +    ['0xd04', ['-mcpu=cortex-a35']],
>>>> +    ['0xd07', ['-mcpu=cortex-a57']],
>>>> +    ['0xd08', ['-mcpu=cortex-a72']],
>>>> +    ['0xd09', ['-mcpu=cortex-a73']],
>>>> +    ['0xd0a', ['-mcpu=cortex-a75']]]
>>>> +
>>>> +machine_args_cavium = [
>>>> +    ['default', ['-march=armv8-a+crc+crypto','-mcpu=thunderx']],
>>>> +    ['native', ['-march=native']],
>>>> +    ['0xa1', ['-mcpu=thunderxt88'], flags_thunderx_extra],
>>>> +    ['0xa2', ['-mcpu=thunderxt81'], flags_thunderx_extra],
>>>> +    ['0xa3', ['-mcpu=thunderxt83'], flags_thunderx_extra]]
>>>>
>>>>  ## Arm implementer ID (ARM DDI 0487C.a, Section G7.2.106, Page G7-5321)
>>>>  impl_generic = ['Generic armv8', flags_generic, machine_args_generic]
>>>> @@ -157,8 +157,16 @@ else
>>>>      endif
>>>>      foreach marg: machine[2]
>>>>          if marg[0] == impl_pn
>>>> -            foreach f: marg[1]
>>>> -                machine_args += f
>>>> +            foreach flag: marg[1]
>>>> +                if cc.has_argument(flag)
>>>> +                    machine_args += flag
>>>> +                endif
>>>> +            endforeach
>>>> +            # Apply any extra machine specific flags.
>>>> +            foreach flag: marg.get(2, flags_default_extra)
>>>> +                if flag.length() > 0
>>>> +                    dpdk_conf.set(flag[0], flag[1])
>>>> +                endif
>>>
>>> Let me continue the discussion from v7 here.
>>> Seems I wasn't clear enough.
>>>
>>> Let me take an example. If the host is thunderx2 (0xaf) and the compiler is
>>> older than v7, flags_thunderx2_extra isn't set. This means, for example,
>>> RTE_CACHE_LINE_SIZE will still be 128. Is that what you want?
>>> RTE_CACHE_LINE_SIZE has nothing to do with compiler support and you might
>>> want to set it regardless of gcc version. You could skip setting -mcpu
>>> while still setting the extra flags.
>>>
>>
>> Thanks for the detailed explanation.
>> I think since we have the check to skip mcpu flag when cc doesn't support it
>> (cc.has_argument(flag))
>> It will be safe to remove
>> `
>> # Primary part number based mcpu flags are supported
>> # for gcc versions > 7
>> if cc.version().version_compare(
>>         '<7.0') or cmd_output.length() == 0
>>     if not meson.is_cross_build() and arm_force_native_march == true
>>         impl_pn = 'native'
>>     else
>>         impl_pn = 'default'
>>     endif
>> endif
>> `
>
> +1

I've tested it but still have an issue with old gcc. When cc.has_argument()
rejects -mcpu, -march isn't set either, so the assembler fails because the
CRC feature is missing. -march should have '+crc'. The error I got was:

> ninja: Entering directory `build'
> [942/1452] Compiling C object
> 'drivers/drivers...c@sta/net_softnic_rte_eth_softnic_action.c.o'.
> FAILED:
> drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic_action.c.o
> cc -Idrivers/drivers@@tmp_rte_pmd_softnic@sta -Idrivers -I../drivers
> -Idrivers/net/softnic -I../drivers/net/softnic -Ilib/librte_ethdev
> -I../lib/librte_ethdev -I. -I../ -Iconfig -I../config
> -Ilib/librte_eal/common/include -I../lib/librte_eal/common/include
> -I../lib/librte_eal/linux/eal/include -Ilib/librte_eal/common
> -I../lib/librte_eal/common -Ilib/librte_eal/common/include/arch/arm
> -I../lib/librte_eal/common/include/arch/arm
> -Ilib/librte_eal -I../lib/librte_eal -Ilib/librte_kvargs
> -I../lib/librte_kvargs -Ilib/librte_net -I../lib/librte_net -Ilib/librte_mbuf
> -I../lib/librte_mbuf -Ilib/librte_mempool -I../lib/librte_mempool
> -Ilib/librte_ring -I../lib/librte_ring -Ilib/librte_cmdline
> -I../lib/librte_cmdline -Ilib/librte_meter -I../lib/librte_meter
> -Idrivers/bus/pci -I../drivers/bus/pci
> -I../drivers/bus/pci/linux -Ilib/librte_pci -I../lib/librte_pci
> -Idrivers/bus/vdev -I../drivers/bus/vdev -Ilib/librte_pipeline
> -I../lib/librte_pipeline -Ilib/librte_port -I../lib/librte_port
> -Ilib/librte_sched -I../lib/librte_sched -Ilib/librte_ip_frag
> -I../lib/librte_ip_frag -Ilib/librte_hash -I../lib/librte_hash
> -Ilib/librte_cryptodev -I../lib/librte_cryptodev
> -Ilib/librte_kni -I../lib/librte_kni -Ilib/librte_table -I../lib/librte_table
> -Ilib/librte_lpm -I../lib/librte_lpm -Ilib/librte_acl -I../lib/librte_acl
> -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -O3 -include rte_config.h
> -Wsign-compare -Wcast-qual -fPIC -D_GNU_SOURCE -DALLOW_EXPERIMENTAL_API
> -MD -MQ
> 'drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic_action.c.o'
> -MF
> 'drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic_action.c.o.d'
> -o
> 'drivers/drivers@@tmp_rte_pmd_softnic@sta/net_softnic_rte_eth_softnic_action.c.o'
> -c ../drivers/net/softnic/rte_eth_softnic_action.c
> {standard input}: Assembler messages:
> {standard input}:14: Error: selected processor does not support `crc32cx w3,w3,x0'
> {standard input}:37: Error: selected processor does not support `crc32cx w1,w1,x3'
> {standard input}:40: Error: selected processor does not support `crc32cx w0,w0,x2'

My machine has 0x41 (Arm) and 0xd08 (cortex-a72).
gcc is '4.8.5 20150623 (Red Hat 4.8.5-28)'

Thanks,
Yongseok

>
>>
>> The command output check can also be removed as it is handled when calling
>> the command script itself.
>
> +1
>
>>
>> Thoughts?
>>
>> PS. I think the safest way to set CACHELINE_SIZE is to read the cache type
>> register[1] but sadly only few latest kernels
>> have the support through sysfs
>> (/sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size)
>
> +1
>
> In summary, +3. LoL
>
> I'll also submit a patch to change the default cacheline size of cortex-a72
> with the new flags_*_extra[]
>
>
> thanks,
> Yongseok
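
To illustrate the -march issue reported in this reply, here is a minimal
standalone sketch of one possible fallback: if the compiler rejects every
part-number specific -mcpu flag, add a baseline -march that enables CRC. This
is not code from the patch. The project name, the variables cpu_flags and
fallback_flags, and the choice of '-march=armv8-a+crc' as the baseline are all
assumptions made for illustration, and whether a given old gcc accepts that
flag has not been verified here, which is why the fallback is also guarded by
cc.has_argument().

```meson
# Sketch only: names below are illustrative, not taken from the patch.
project('march-fallback-sketch', 'c')

cc = meson.get_compiler('c')

# Part-number specific flag for the detected CPU (0xd08, cortex-a72).
cpu_flags = ['-mcpu=cortex-a72']
# Assumed baseline that enables CRC when no -mcpu flag is usable.
fallback_flags = ['-march=armv8-a+crc']

machine_args = []
foreach flag : cpu_flags
    if cc.has_argument(flag)
        machine_args += flag
    endif
endforeach

# If every -mcpu flag was rejected, fall back to a baseline -march
# so that CRC instructions are still enabled where the compiler allows it.
if machine_args.length() == 0
    foreach flag : fallback_flags
        if cc.has_argument(flag)
            machine_args += flag
        endif
    endforeach
endif

message('machine_args: ' + ' '.join(machine_args))
```

With this shape, the extra RTE_* flags can still be applied per part number
independently of which compiler flags were accepted, which matches the
separation discussed earlier in the thread.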