Yes, that makes sense, will have a try and keep you posted.

Pan

-----Original Message-----
From: Kito Cheng <kito.ch...@gmail.com> 
Sent: Saturday, May 6, 2023 10:19 AM
To: Li, Pan2 <pan2...@intel.com>
Cc: juzhe.zh...@rivai.ai; rguenther <rguent...@suse.de>; richard.sandiford 
<richard.sandif...@arm.com>; jeffreyalaw <jeffreya...@gmail.com>; gcc-patches 
<gcc-patches@gcc.gnu.org>; palmer <pal...@dabbelt.com>; jakub <ja...@redhat.com>
Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to 
16-bit

I think x86 first? The major thing we want to make sure is that this change 
won't affect those targets which do not really require 16 bit machine_mode too 
much.


On Sat, May 6, 2023 at 10:12 AM Li, Pan2 via Gcc-patches 
<gcc-patches@gcc.gnu.org> wrote:
>
> Sure thing, I will pick them all up together and trigger the test again
> (I will send out the overall diff before starting, to make sure my
> understanding is correct). BTW, which target do we prefer first, x86 or RISC-V?
>
> Pan
>
> From: juzhe.zh...@rivai.ai <juzhe.zh...@rivai.ai>
> Sent: Saturday, May 6, 2023 10:00 AM
> To: kito.cheng <kito.ch...@gmail.com>; Li, Pan2 <pan2...@intel.com>
> Cc: rguenther <rguent...@suse.de>; richard.sandiford 
> <richard.sandif...@arm.com>; jeffreyalaw <jeffreya...@gmail.com>; 
> gcc-patches <gcc-patches@gcc.gnu.org>; palmer <pal...@dabbelt.com>; 
> jakub <ja...@redhat.com>
> Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from 
> 8-bit to 16-bit
>
> Yeah, you should also swap mode and code in rtx_def according to
> Richard's suggestion, since it will not change the size of the rtx_def
> data structure.
>
> I think the only problem is the mode in tree data structure.
> ________________________________
> juzhe.zh...@rivai.ai
>
> From: Kito Cheng <kito.ch...@gmail.com>
> Date: 2023-05-06 09:53
> To: Li, Pan2 <pan2...@intel.com>
> CC: Richard Biener <rguent...@suse.de>; 钟居哲 <juzhe.zh...@rivai.ai>;
> richard.sandiford <richard.sandif...@arm.com>; Jeff Law
> <jeffreya...@gmail.com>; gcc-patches <gcc-patches@gcc.gnu.org>;
> palmer <pal...@dabbelt.com>; jakub <ja...@redhat.com>
> Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from 
> 8-bit to 16-bit
>
> Hi Pan:
>
> Could you try to apply the following diff and measure again? This 
> makes tree_type_common size unchanged.
>
>
> sizeof tree_type_common = 128 (mode = 8 bit)
> sizeof tree_type_common = 136 (mode = 16 bit)
> sizeof tree_type_common = 128 (mode = 16 bit w/ this diff)
>
> diff --git a/gcc/tree-core.h b/gcc/tree-core.h
> index af795aa81f98..b8ccfa407ed9 100644
> --- a/gcc/tree-core.h
> +++ b/gcc/tree-core.h
> @@ -1680,6 +1680,8 @@ struct GTY(()) tree_type_common {
>   tree attributes;
>   unsigned int uid;
>
> +  ENUM_BITFIELD(machine_mode) mode : 16;
> +
>   unsigned int precision : 10;
>   unsigned no_force_blk_flag : 1;
>   unsigned needs_constructing_flag : 1;
> @@ -1687,7 +1689,6 @@ struct GTY(()) tree_type_common {
>   unsigned restrict_flag : 1;
>   unsigned contains_placeholder_bits : 2;
>
> -  ENUM_BITFIELD(machine_mode) mode : 16;
>
>   /* TYPE_STRING_FLAG for INTEGER_TYPE and ARRAY_TYPE.
>      TYPE_CXX_ODR_P for RECORD_TYPE and UNION_TYPE.  */
> @@ -1712,7 +1713,7 @@ struct GTY(()) tree_type_common {
>   unsigned empty_flag : 1;
>   unsigned indivisible_p : 1;
>   unsigned no_named_args_stdarg_p : 1;
> -  unsigned spare : 15;
> +  unsigned spare : 7;
>
>   alias_set_type alias_set;
>   tree pointer_to;
>
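As a rough illustration of why moving the mode field can keep the structure size unchanged, here is a minimal sketch. It does not use the real tree_type_common: the struct and field names below are made up, and the layout comments assume a common ABI in which an unsigned bit-field may not straddle a 32-bit storage unit.

```c
#include <stdio.h>

/* Toy layouts only; the field names are made up and are not GCC's. */

/* The 16-bit mode follows 20 bits of flags.  It does not fit in the
   remaining 12 bits of the first 32-bit unit, so common ABIs start a
   new unit, and a third unit is needed for the trailing 12 bits.  */
struct mode_in_the_middle {
    void *attributes;
    unsigned flags_a : 20;
    unsigned mode : 16;
    unsigned flags_b : 16;
    unsigned flags_c : 12;
};

/* Same fields, ordered so that each 32-bit unit is filled exactly:
   16 + 16 and 20 + 12.  */
struct mode_first {
    void *attributes;
    unsigned mode : 16;
    unsigned flags_b : 16;
    unsigned flags_a : 20;
    unsigned flags_c : 12;
};

/* Print both sizes so the effect of the ordering can be inspected. */
void print_layout_sizes(void)
{
    printf("mode in the middle: %zu bytes\n",
           sizeof(struct mode_in_the_middle));
    printf("mode first:         %zu bytes\n",
           sizeof(struct mode_first));
}
```

Calling print_layout_sizes() should show the second layout as the smaller one on such ABIs. Bit-field layout is implementation-defined, so the exact numbers depend on the target.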
> On Sat, May 6, 2023 at 9:10 AM Li, Pan2 via Gcc-patches 
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Yes, I totally agree that the numbers can only be accurate up to a point.
> > Here are the corresponding bytes allocated for the x86 target.
> >
> > Bytes allocated with O2:
> > -----------------------------------------------------------------------------------------------------
> > Benchmark               |  upstream             | with this PATCH
> > -----------------------------------------------------------------------------------------------------
> > 400.perlbench           | 25286185160           | 25176544846 ~0.0%
> > 401.bzip2               | 1429883731            | 1391040027 -2.7%
> > 403.gcc                 | 55023568981           | 54798890746 ~0.0%
> > 429.mcf                 | 1360975660            | 1321537710 -2.9%
> > 445.gobmk               | 12791636502           | 12666523431 -1.0%
> > 456.hmmer               | 9354433652            | 9279189174 ~0.0%
> > 458.sjeng               | 1991260562            | 1944031904 -2.4%
> > 462.libquantum          | 1725112078            | 1684213981 -2.4%
> > 464.h264ref             | 8597673515            | 8528855778 ~0.0%
> > 471.omnetpp             | 37613034778           | 37432278047 ~0.0%
> > 473.astar               | 3817295518            | 3772460508 -1.2%
> > 483.xalancbmk           | 149418776991          | 148545162207 ~0.0%
> >
> > Bytes allocated with Ofast + funroll-loops:
> > ------------------------------------------------------------------------------------------
> > Benchmark               |  upstream             | with this PATCH
> > ------------------------------------------------------------------------------------------
> > 400.perlbench           | 30438407499           | 30574152897 ~0.0%
> > 401.bzip2               | 2277114519            | 2319432664 +1.9%
> > 403.gcc                 | 64499664264           | 64781232731 ~0.0%
> > 429.mcf                 | 1361486758            | 1399942116 +2.8%
> > 445.gobmk               | 15258056111           | 15396801542 +1.0%
> > 456.hmmer               | 10896615649           | 10936223486 ~0.0%
> > 458.sjeng               | 2592620709            | 2641687496 +1.9%
> > 462.libquantum          | 1814487525            | 1854518500 +2.2%
> > 464.h264ref             | 13528736878           | 13614517066 ~0.0%
> > 471.omnetpp             | 38721066702           | 38910524667 ~0.0%
> > 473.astar               | 3924015756            | 3968057027 +1.1%
> > 483.xalancbmk           | 165897692838          | 166843885880 ~0.0%
> >
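For anyone re-deriving the percentage column, a minimal sketch of the arithmetic follows. The function name percent_change is made up for this illustration; it simply computes (patched - upstream) / upstream in percent from the two byte counts of a row.

```c
/* Relative change of the patched byte count against the upstream
   baseline, in percent.  Negative means fewer bytes were allocated.  */
double percent_change(unsigned long long upstream,
                      unsigned long long patched)
{
    return ((double)patched - (double)upstream)
           / (double)upstream * 100.0;
}
```

For example, percent_change(1429883731ULL, 1391040027ULL) for the 401.bzip2 row at O2 gives roughly -2.7, matching the table.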
> > Pan
> >
> >
> > -----Original Message-----
> > From: Richard Biener <rguent...@suse.de>
> > Sent: Friday, May 5, 2023 2:25 PM
> > To: Li, Pan2 <pan2...@intel.com>
> > Cc: 钟居哲 <juzhe.zh...@rivai.ai>; kito.cheng <kito.ch...@gmail.com>;
> > richard.sandiford <richard.sandif...@arm.com>; Jeff Law
> > <jeffreya...@gmail.com>; gcc-patches <gcc-patches@gcc.gnu.org>;
> > palmer <pal...@dabbelt.com>; jakub <ja...@redhat.com>
> > Subject: RE: Re: [PATCH] machine_mode type size: Extend enum size 
> > from 8-bit to 16-bit
> >
> > On Fri, 5 May 2023, Li, Pan2 wrote:
> >
> > > I tried memory profiling with valgrind --tool=memcheck
> > > --trace-children=yes for this change, targeting the SPEC 2006 INT part with
> > > rv64gcv. Note we only count the bytes allocated from valgrind log lines like
> > > this: "==2832896==   total heap usage: 208 allocs, 165 frees, 123,204
> > > bytes allocated".
> > >
> > > Allowing for some variance in valgrind, it looks like the impact on
> > > bytes allocated may be limited. However, I am still running this
> > > for x86; it will take more than 30 hours for each iteration...
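The counting step described above could be sketched roughly as follows. This is not the script that was actually used; the function name bytes_allocated is made up, and it assumes each summary line has the shape quoted above, with commas as thousands separators.

```c
#include <ctype.h>
#include <string.h>

/* Extract N from a memcheck summary line of the form
   "... total heap usage: A allocs, F frees, N bytes allocated".
   N may contain thousands separators.  Returns -1 if the line does
   not look like a summary line.  */
long long bytes_allocated(const char *line)
{
    const char *end = strstr(line, " bytes allocated");
    if (end == NULL || strstr(line, "total heap usage:") == NULL)
        return -1;

    /* Walk back over digits and commas to the start of the number. */
    const char *start = end;
    while (start > line
           && (isdigit((unsigned char)start[-1]) || start[-1] == ','))
        start--;
    if (start == end)
        return -1;

    /* Accumulate the digits, skipping the separators. */
    long long value = 0;
    for (const char *p = start; p < end; p++)
        if (*p != ',')
            value = value * 10 + (*p - '0');
    return value;
}
```

Summing the return value over every summary line in the log (one per traced child process) would give a per-benchmark total of the kind shown in the tables.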
> >
> > I'm not sure I'd call +- 7% on memory use "limited" - but I fear the 
> > numbers are off.  Note since various structures reside in GC memory there's 
> > also changes to GC overhead and fragmentation, so precise measurements are 
> > difficult.
> >
> > Richard.
> >
> > > RISC-V GCC Version:
> > > >> ~/bin/test-gnu-8-bits/bin/riscv64-unknown-linux-gnu-gcc --version
> > > riscv64-unknown-linux-gnu-gcc (gd7cb9720ed5) 14.0.0 20230503 (experimental)
> > > Copyright (C) 2023 Free Software Foundation, Inc.
> > > This is free software; see the source for copying conditions.  There is NO
> > > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> > >
> > > Bytes allocated with O2:
> > > -----------------------------------------------------------------------------------------------------
> > > Benchmark             |  upstream             | with this PATCH
> > > -----------------------------------------------------------------------------------------------------
> > > 400.perlbench         | 29699642875           | 29949876269 ~0.0%
> > > 401.bzip2             | 1641041659            | 1755563972 +6.95%
> > > 403.gcc               | 68447500516           | 68900883291 ~0.0%
> > > 429.mcf               | 1433156462            | 1433253373 ~0.0%
> > > 445.gobmk             | 14239225210           | 14463438465 ~0.0%
> > > 456.hmmer             | 9635955623            | 9808534948 +1.8%
> > > 458.sjeng             | 2419478204            | 2545478940 +5.4%
> > > 462.libquantum        | 1686404489            | 1800884197 +6.8%
> > > 464.h264ref           | 10190413900           | 10351134161 +1.6%
> > > 471.omnetpp           | 40814627684           | 41185864529 ~0.0%
> > > 473.astar             | 3807097529            | 3928428183 +3.2%
> > > 483.xalancbmk         | 152959418167          | 154201738843 ~0.0%
> > >
> > > Bytes allocated with Ofast + funroll-loops:
> > > ------------------------------------------------------------------------------------------
> > > Benchmark             |  upstream             | with this PATCH
> > > ------------------------------------------------------------------------------------------
> > > 400.perlbench         |  39491184733          | 39223020267 ~0.0%
> > > 401.bzip2             |  2843871517           | 2730383463 ~0%
> > > 403.gcc               |  84195991898          | 83730632955 -4.0%
> > > 429.mcf               |  1481381164           | 1367309565 -7.7%
> > > 445.gobmk             |  20123943663          | 19886116394 -1.2%
> > > 456.hmmer             |  12302445139          | 12121745383 -1.5%
> > > 458.sjeng             |  3884712615           | 3755481930  -3.3%
> > > 462.libquantum        |  1966619940           | 1852274342  -5.8%
> > > 464.h264ref           |  19219365552          | 19050288201 ~0.0%
> > > 471.omnetpp           |  45701008325          | 45327805079 ~0.0%
> > > 473.astar             |  4118600354           | 3995943705 -3.0%
> > > 483.xalancbmk         |  179481305182         | 178160306301 ~0.0%
> > >
> > > Pan
> > >
> > >
> > > -----Original Message-----
> > > From: Gcc-patches <gcc-patches-bounces+pan2.li=intel....@gcc.gnu.org>
> > > On Behalf Of 钟居哲
> > > Sent: Thursday, April 13, 2023 7:23 AM
> > > To: kito.cheng <kito.ch...@gmail.com>; rguenther <rguent...@suse.de>
> > > Cc: richard.sandiford <richard.sandif...@arm.com>; Jeff Law
> > > <jeffreya...@gmail.com>; gcc-patches <gcc-patches@gcc.gnu.org>;
> > > palmer <pal...@dabbelt.com>; jakub <ja...@redhat.com>
> > > Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size 
> > > from 8-bit to 16-bit
> > >
> > > Yeah, like kito said.
> > > Turns out the tuple type model in ARM SVE is the optimal solution for RVV.
> > > And we like the ARM SVE style implementation.
> > >
> > > And now we see that swapping rtx_code and mode in rtx_def keeps rtx_def
> > > overall within 64 bits.
> > > But it seems that there is still a problem in tree_type_common and
> > > tree_decl_common, is that right?
> > >
> > > After several tries (removing all redundant TI/TF vector modes and FP16
> > > vector modes), there are now 252 modes in the RISC-V port. Basically, I can
> > > keep supporting the recent new RVV intrinsics features.
> > > However, we can't support more in the future, for example, FP16 vector,
> > > BF16 vector, matrix modes, VLS modes, etc.
> > >
> > > From the RVV side, I think extending the machine mode by 1 more bit should
> > > be enough for RVV (512 modes overall).
> > > Is it possible to make that happen in tree_type_common and tree_decl_common,
> > > Richards?
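The mode-count arithmetic above can be checked with a small sketch. The helper name bits_for_modes is made up for this illustration and is not a GCC function; it returns the narrowest bit-field width that can hold mode numbers 0 through num_modes - 1.

```c
/* Smallest bit-field width able to hold the values 0 .. num_modes - 1.
   The loop stops at 32 bits so the shift stays well defined.  */
unsigned bits_for_modes(unsigned num_modes)
{
    unsigned bits = 0;
    while (bits < 32 && (1ULL << bits) < num_modes)
        bits++;
    return bits;
}
```

With 252 modes this gives 8 bits, leaving only four unused values out of 256, while any count from 257 up to 512 needs 9 bits.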
> > >
> > > Thank you so much for all comments.
> > >
> > >
> > > juzhe.zh...@rivai.ai
> > >
> > > From: Kito Cheng
> > > Date: 2023-04-12 17:31
> > > To: Richard Biener
> > > CC: juzhe.zh...@rivai.ai;
> > > richard.sandiford; jeffreyalaw; gcc-patches; palmer; jakub
> > > Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size 
> > > from 8-bit to 16-bit
> > > > > The concept of fractional LMUL is the same as the concept of 
> > > > > AArch64's partial SVE vectors, so they can only access the 
> > > > > lowest part, like SVE's partial vector.
> > > > >
> > > > > We want to spill/restore the exact size of those modes (1/2, 
> > > > > 1/4, 1/8), so adding dedicated modes for those partial vector 
> > > > > modes should be unavoidable IMO.
> > > > >
> > > > > And even if we use sub-vector, we still need to define those 
> > > > > partial vector types.
> > > >
> > > > Could you use integer modes for the fractional vectors?
> > >
> > > You mean using a scalar integer mode, like using (subreg:SI (reg:VNx4SI) 0)
> > > to represent LMUL=1/4? (Assume VNx4SI is the mode for M1.)
> > >
> > > If so I think it might not be able to model that right - it seems like we 
> > > are using 32-bits but actually we are using poly_int16(1, 1) * 32 bits.
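To make the size mismatch concrete, here is a minimal sketch of a two-coefficient polynomial size in the spirit of GCC's poly_int, where a value c0 + c1 * x depends on a non-negative runtime quantity x. The struct and function names are made up, and how x maps to the hardware vector length is left open here.

```c
/* A size of the form c0 + c1 * x, where x >= 0 is only known at run
   time.  This mimics the idea of a two-coefficient poly_int; it is
   not GCC's implementation.  */
struct poly2 {
    long c0;
    long c1;
};

/* Multiply both coefficients by a compile-time constant. */
struct poly2 poly2_scale(struct poly2 p, long k)
{
    struct poly2 r = { p.c0 * k, p.c1 * k };
    return r;
}

/* Evaluate the size once the runtime value x is known. */
long poly2_eval(struct poly2 p, long x)
{
    return p.c0 + p.c1 * x;
}
```

Scaling {1, 1} by 32 gives {32, 32}, that is 32 + 32 * x bits: 32 bits only when x is 0, and for example 128 bits when x is 3. A fixed 32-bit SI subreg cannot express that.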
> > >
> > > > For computation you can always appropriately limit the LEN?
> > >
> > > RVV provides the zvl*b extensions, such as zvl<N>b (e.g. zvl128b or
> > > zvl256b), to guarantee that the vector length is at least N bits, but
> > > that only guarantees the minimum length, just as SVE guarantees a
> > > minimum vector length of 128 bits.
> > >
> > >
> >
> > --
> > Richard Biener <rguent...@suse.de>
> > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 
> > Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, 
> > Boudien Moerman; HRB 36809 (AG Nuernberg)
>
