Sure thing, I will pick them all up together and trigger the test again (I 
will send out the overall diff before starting, to make sure my understanding 
is correct). BTW, which target do we prefer first, x86 or RISC-V?

Pan

From: juzhe.zh...@rivai.ai <juzhe.zh...@rivai.ai>
Sent: Saturday, May 6, 2023 10:00 AM
To: kito.cheng <kito.ch...@gmail.com>; Li, Pan2 <pan2...@intel.com>
Cc: rguenther <rguent...@suse.de>; richard.sandiford 
<richard.sandif...@arm.com>; jeffreyalaw <jeffreya...@gmail.com>; gcc-patches 
<gcc-patches@gcc.gnu.org>; palmer <pal...@dabbelt.com>; jakub <ja...@redhat.com>
Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to 
16-bit

Yeah, you should also swap mode and code in rtx_def according to Richard's 
suggestion, since that will not change the size of the rtx_def data structure.

I think the only remaining problem is the mode field in the tree data 
structures.
________________________________
juzhe.zh...@rivai.ai

From: Kito Cheng <kito.ch...@gmail.com>
Date: 2023-05-06 09:53
To: Li, Pan2 <pan2...@intel.com>
CC: Richard Biener <rguent...@suse.de>; 钟居哲 <juzhe.zh...@rivai.ai>; 
richard.sandiford <richard.sandif...@arm.com>; Jeff Law 
<jeffreya...@gmail.com>; gcc-patches <gcc-patches@gcc.gnu.org>; 
palmer <pal...@dabbelt.com>; jakub <ja...@redhat.com>
Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to 
16-bit
Hi Pan:

Could you try to apply the following diff and measure again? This
makes tree_type_common size unchanged.


sizeof tree_type_common = 128 (mode = 8 bit)
sizeof tree_type_common = 136 (mode = 16 bit)
sizeof tree_type_common = 128 (mode = 16 bit w/ this diff)

diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index af795aa81f98..b8ccfa407ed9 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1680,6 +1680,8 @@ struct GTY(()) tree_type_common {
  tree attributes;
  unsigned int uid;

+  ENUM_BITFIELD(machine_mode) mode : 16;
+
  unsigned int precision : 10;
  unsigned no_force_blk_flag : 1;
  unsigned needs_constructing_flag : 1;
@@ -1687,7 +1689,6 @@ struct GTY(()) tree_type_common {
  unsigned restrict_flag : 1;
  unsigned contains_placeholder_bits : 2;

-  ENUM_BITFIELD(machine_mode) mode : 16;

  /* TYPE_STRING_FLAG for INTEGER_TYPE and ARRAY_TYPE.
     TYPE_CXX_ODR_P for RECORD_TYPE and UNION_TYPE.  */
@@ -1712,7 +1713,7 @@ struct GTY(()) tree_type_common {
  unsigned empty_flag : 1;
  unsigned indivisible_p : 1;
  unsigned no_named_args_stdarg_p : 1;
-  unsigned spare : 15;
+  unsigned spare : 7;

  alias_set_type alias_set;
  tree pointer_to;

On Sat, May 6, 2023 at 9:10 AM Li, Pan2 via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Yes, I totally agree that the numbers can only be accurate up to a point. 
> Here are the corresponding bytes allocated for the x86 target.
>
> Bytes allocated with O2:
> ---------------------------------------------------
> Benchmark      | upstream     | with this PATCH
> ---------------------------------------------------
> 400.perlbench  | 25286185160  | 25176544846  ~0.0%
> 401.bzip2      | 1429883731   | 1391040027   -2.7%
> 403.gcc        | 55023568981  | 54798890746  ~0.0%
> 429.mcf        | 1360975660   | 1321537710   -2.9%
> 445.gobmk      | 12791636502  | 12666523431  -1.0%
> 456.hmmer      | 9354433652   | 9279189174   ~0.0%
> 458.sjeng      | 1991260562   | 1944031904   -2.4%
> 462.libquantum | 1725112078   | 1684213981   -2.4%
> 464.h264ref    | 8597673515   | 8528855778   ~0.0%
> 471.omnetpp    | 37613034778  | 37432278047  ~0.0%
> 473.astar      | 3817295518   | 3772460508   -1.2%
> 483.xalancbmk  | 149418776991 | 148545162207 ~0.0%
>
> Bytes allocated with Ofast + funroll-loops:
> ---------------------------------------------------
> Benchmark      | upstream     | with this PATCH
> ---------------------------------------------------
> 400.perlbench  | 30438407499  | 30574152897  ~0.0%
> 401.bzip2      | 2277114519   | 2319432664   +1.9%
> 403.gcc        | 64499664264  | 64781232731  ~0.0%
> 429.mcf        | 1361486758   | 1399942116   +2.8%
> 445.gobmk      | 15258056111  | 15396801542  +1.0%
> 456.hmmer      | 10896615649  | 10936223486  ~0.0%
> 458.sjeng      | 2592620709   | 2641687496   +1.9%
> 462.libquantum | 1814487525   | 1854518500   +2.2%
> 464.h264ref    | 13528736878  | 13614517066  ~0.0%
> 471.omnetpp    | 38721066702  | 38910524667  ~0.0%
> 473.astar      | 3924015756   | 3968057027   +1.1%
> 483.xalancbmk  | 165897692838 | 166843885880 ~0.0%
>
> Pan
>
>
> -----Original Message-----
> From: Richard Biener <rguent...@suse.de>
> Sent: Friday, May 5, 2023 2:25 PM
> To: Li, Pan2 <pan2...@intel.com>
> Cc: 钟居哲 <juzhe.zh...@rivai.ai>; kito.cheng <kito.ch...@gmail.com>; 
> richard.sandiford <richard.sandif...@arm.com>; Jeff Law 
> <jeffreya...@gmail.com>; gcc-patches <gcc-patches@gcc.gnu.org>; palmer 
> <pal...@dabbelt.com>; jakub <ja...@redhat.com>
> Subject: RE: Re: [PATCH] machine_mode type size: Extend enum size from 8-bit 
> to 16-bit
>
> On Fri, 5 May 2023, Li, Pan2 wrote:
>
> > I tried memory profiling with valgrind --tool=memcheck 
> > --trace-children=yes for this change, targeting the SPEC 2006 INT 
> > benchmarks with rv64gcv. Note that we only count the bytes allocated as 
> > reported in the valgrind log, in lines like this: 
> > "==2832896==   total heap usage: 208 allocs, 165 frees, 123,204 bytes 
> > allocated".
> >
> > Allowing for some variance in valgrind, it looks like the impact on bytes
> > allocated may be limited. However, I am still running this for x86; it
> > will take more than 30 hours for each iteration...
>
> I'm not sure I'd call +- 7% on memory use "limited" - but I fear the numbers 
> are off.  Note since various structures reside in GC memory there's also 
> changes to GC overhead and fragmentation, so precise measurements are 
> difficult.
>
> Richard.
>
> > RISC-V GCC Version:
> > >> ~/bin/test-gnu-8-bits/bin/riscv64-unknown-linux-gnu-gcc --version
> > riscv64-unknown-linux-gnu-gcc (gd7cb9720ed5) 14.0.0 20230503 (experimental)
> > Copyright (C) 2023 Free Software Foundation, Inc.
> > This is free software; see the source for copying conditions.  There is NO
> > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> >
> > Bytes allocated with O2:
> > ---------------------------------------------------
> > Benchmark      | upstream     | with this PATCH
> > ---------------------------------------------------
> > 400.perlbench  | 29699642875  | 29949876269  ~0.0%
> > 401.bzip2      | 1641041659   | 1755563972   +6.95%
> > 403.gcc        | 68447500516  | 68900883291  ~0.0%
> > 429.mcf        | 1433156462   | 1433253373   ~0.0%
> > 445.gobmk      | 14239225210  | 14463438465  ~0.0%
> > 456.hmmer      | 9635955623   | 9808534948   +1.8%
> > 458.sjeng      | 2419478204   | 2545478940   +5.4%
> > 462.libquantum | 1686404489   | 1800884197   +6.8%
> > 464.h264ref    | 10190413900  | 10351134161  +1.6%
> > 471.omnetpp    | 40814627684  | 41185864529  ~0.0%
> > 473.astar      | 3807097529   | 3928428183   +3.2%
> > 483.xalancbmk  | 152959418167 | 154201738843 ~0.0%
> >
> > Bytes allocated with Ofast + funroll-loops:
> > ---------------------------------------------------
> > Benchmark      | upstream     | with this PATCH
> > ---------------------------------------------------
> > 400.perlbench  | 39491184733  | 39223020267  ~0.0%
> > 401.bzip2      | 2843871517   | 2730383463   -4.0%
> > 403.gcc        | 84195991898  | 83730632955  ~0.0%
> > 429.mcf        | 1481381164   | 1367309565   -7.7%
> > 445.gobmk      | 20123943663  | 19886116394  -1.2%
> > 456.hmmer      | 12302445139  | 12121745383  -1.5%
> > 458.sjeng      | 3884712615   | 3755481930   -3.3%
> > 462.libquantum | 1966619940   | 1852274342   -5.8%
> > 464.h264ref    | 19219365552  | 19050288201  ~0.0%
> > 471.omnetpp    | 45701008325  | 45327805079  ~0.0%
> > 473.astar      | 4118600354   | 3995943705   -3.0%
> > 483.xalancbmk  | 179481305182 | 178160306301 ~0.0%
> >
> > Pan
> >
> >
> > -----Original Message-----
> > From: Gcc-patches <gcc-patches-bounces+pan2.li=intel....@gcc.gnu.org> 
> > On Behalf Of 钟居哲
> > Sent: Thursday, April 13, 2023 7:23 AM
> > To: kito.cheng <kito.ch...@gmail.com>; rguenther <rguent...@suse.de>
> > Cc: richard.sandiford <richard.sandif...@arm.com>; Jeff Law
> > <jeffreya...@gmail.com>; gcc-patches <gcc-patches@gcc.gnu.org>; palmer
> > <pal...@dabbelt.com>; jakub <ja...@redhat.com>
> > Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from
> > 8-bit to 16-bit
> >
> > Yeah, like Kito said.
> > It turns out that the tuple type model in ARM SVE is the optimal solution 
> > for RVV, and we like the ARM SVE style implementation.
> >
> > And now we see that swapping rtx_code and mode in rtx_def can keep rtx_def 
> > overall from exceeding 64 bits.
> > But it seems that there is still a problem in tree_type_common and 
> > tree_decl_common, is that right?
> >
> > After several tries (removing all redundant TI/TF vector modes and the 
> > FP16 vector modes), there are now 252 modes in the RISC-V port. Basically, 
> > I can keep supporting the recent new RVV intrinsics features.
> > However, we can't support more in the future, for example FP16 vectors, 
> > BF16 vectors, matrix modes, VLS modes, etc.
> >
> > From the RVV side, I think extending the machine mode by 1 more bit should 
> > be enough for RVV (512 modes overall).
> > Is it possible to make that happen in tree_type_common and 
> > tree_decl_common, Richards?
> >
> > Thank you so much for all comments.
> >
> >
> > juzhe.zh...@rivai.ai
> >
> > From: Kito Cheng
> > Date: 2023-04-12 17:31
> > To: Richard Biener
> > CC: juzhe.zh...@rivai.ai; richard.sandiford; jeffreyalaw; gcc-patches; 
> > palmer; jakub
> > Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from
> > 8-bit to 16-bit
> > > > The concept of fractional LMUL is the same as the concept of
> > > > AArch64's partial SVE vectors, so they can only access the lowest
> > > > part, like SVE's partial vector.
> > > >
> > > > We want to spill/restore the exact size of those modes (1/2, 1/4,
> > > > 1/8), so adding dedicated modes for those partial vector modes
> > > > should be unavoidable IMO.
> > > >
> > > > And even if we use sub-vector, we still need to define those
> > > > partial vector types.
> > >
> > > Could you use integer modes for the fractional vectors?
> >
> > You mean using the scalar integer mode like using (subreg:SI
> > (reg:VNx4SI) 0) to represent
> > LMUL=1/4?
> > (Assume VNx4SI is mode for M1)
> >
> > If so I think it might not be able to model that right - it seems like we 
> > are using 32-bits but actually we are using poly_int16(1, 1) * 32 bits.
> >
> > > For computation you can always appropriately limit the LEN?
> >
> > RVV provides the zvl*b extensions, zvl<N>b (e.g. zvl128b or zvl256b), to
> > guarantee that the vector length is at least N bits, but that only
> > guarantees the minimal length, just as SVE guarantees that the minimal
> > vector length is 128 bits.
> >
> >
>
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg, 
> Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 
> 36809 (AG Nuernberg)
