Re: [gem5-dev] Review Request 3743: cpu: implement L-TAGE branch predictor

2016-12-16 Thread Jason Lowe-Power

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3743/#review9215
---

Ship it!


Ship It!

- Jason Lowe-Power


On Dec. 16, 2016, 3:17 p.m., Arthur Perais wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/3743/
> ---
> 
> (Updated Dec. 16, 2016, 3:17 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 11707:1d085f66c4ca
> ---
> 
> cpu: implement an L-TAGE branch predictor
> 
> This patch implements an L-TAGE predictor, based on André Seznec's code 
> available from 
> CBP-2 
> (http://hpca23.cse.tamu.edu/taco/camino/cbp2/cbp-src/realistic-seznec.h). The
> patch also changes the default branch predictor of o3 from the tournament 
> predictor
> to L-TAGE.
> 
> This patch requires patch #3727 (http://reviews.gem5.org/r/3727/) to compile.
> 
> 
> Diffs
> -
> 
>   src/cpu/pred/BranchPredictor.py 1d085f66c4ca 
>   src/cpu/pred/SConscript 1d085f66c4ca 
>   src/cpu/pred/ltage.hh PRE-CREATION 
>   src/cpu/pred/ltage.cc PRE-CREATION 
> 
> Diff: http://reviews.gem5.org/r/3743/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Arthur Perais
> 
>

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


Re: [gem5-dev] Review Request 3743: cpu: implement L-TAGE branch predictor

2016-12-15 Thread Jason Lowe-Power

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3743/#review9209
---



src/cpu/pred/ltage.hh (line 379)


Why not use a vector? This is what it's actually representing, 
correct?

I would argue for using the datatype that makes it most clear to future 
people reading this code.


- Jason Lowe-Power


On Nov. 23, 2016, 2:52 p.m., Arthur Perais wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/3743/
> ---
> 
> (Updated Nov. 23, 2016, 2:52 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 11707:1d085f66c4ca
> ---
> 
> cpu: implement an L-TAGE branch predictor
> 
> This patch implements an L-TAGE predictor, based on André Seznec's code 
> available from 
> CBP-2 
> (http://hpca23.cse.tamu.edu/taco/camino/cbp2/cbp-src/realistic-seznec.h). The
> patch also changes the default branch predictor of o3 from the tournament 
> predictor
> to L-TAGE.
> 
> This patch requires patch #3727 (http://reviews.gem5.org/r/3727/) to compile.
> 
> 
> Diffs
> -
> 
>   src/cpu/pred/BranchPredictor.py 1d085f66c4ca 
>   src/cpu/pred/SConscript 1d085f66c4ca 
>   src/cpu/pred/ltage.hh PRE-CREATION 
>   src/cpu/pred/ltage.cc PRE-CREATION 
> 
> Diff: http://reviews.gem5.org/r/3743/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Arthur Perais
> 
>

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


Re: [gem5-dev] Review Request 3743: cpu: implement L-TAGE branch predictor

2016-12-15 Thread Jason Lowe-Power


> On Nov. 23, 2016, 3:17 p.m., Jason Lowe-Power wrote:
> > src/cpu/pred/BranchPredictor.py, line 100
> > 
> >
> > Comment is now wrong.
> > 
> > Also, does it make more sense to call this histBufferEntries (or 
> > histBufferBits) and make the default 2**21? (or would it be 2**21*8? <-- 
> > this is the confusion.) 
> > 
> > I think it would be more clear to use bits or entries, personally. But 
> > if the branch predictor papers usually talk about the history buffer in 
> > terms of bytes instead of bits, I guess the current version makes more 
> > sense.
> 
> Nathanael Premillieu wrote:
> This big buffer is just a software structure and it does not represent a 
> hardware one.
> It is just done like this to simplify the management of the branch 
> history and avoid copying those bits too often.
> The modeled hardware history is implemented as a sliding window on this 
> big buffer.
> 
> Arthur Perais wrote:
> In effect, the predictor will create an array of 2**21 entries, each 
> entry containing an uint8_t, so the buffer is indeed 2MB, even though it 
> contains 2Mbits of history information. As Nathanael pointed out, this is a 
> trick to avoid using a linked-list/deque which is more costly. The buffer is 
> much bigger than the actual history size to remove bound checks and 
> wrap-around adjustment that would be necessary with a circular buffer in 
> *most* places where it is accessed, at the cost of copying the whole history 
> from the beginning to the end of the buffer when the buffer is filled (so it 
> has to be big enough to amortize the cost of the copy).
> 
> If you really think it should be changed to Param.Unsigned, I'll change 
> it. I'll wait for some input before posting the version with the correct 
> comment (my bad).

Yeah, I think it's much better to use an unsigned (int). Otherwise it's not 
clear whether this is the number of bits, or the number of entries, or 
something else.


- Jason


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3743/#review9160
---


On Nov. 23, 2016, 2:52 p.m., Arthur Perais wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/3743/
> ---
> 
> (Updated Nov. 23, 2016, 2:52 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 11707:1d085f66c4ca
> ---
> 
> cpu: implement an L-TAGE branch predictor
> 
> This patch implements an L-TAGE predictor, based on André Seznec's code 
> available from 
> CBP-2 
> (http://hpca23.cse.tamu.edu/taco/camino/cbp2/cbp-src/realistic-seznec.h). The
> patch also changes the default branch predictor of o3 from the tournament 
> predictor
> to L-TAGE.
> 
> This patch requires patch #3727 (http://reviews.gem5.org/r/3727/) to compile.
> 
> 
> Diffs
> -
> 
>   src/cpu/pred/BranchPredictor.py 1d085f66c4ca 
>   src/cpu/pred/SConscript 1d085f66c4ca 
>   src/cpu/pred/ltage.hh PRE-CREATION 
>   src/cpu/pred/ltage.cc PRE-CREATION 
> 
> Diff: http://reviews.gem5.org/r/3743/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Arthur Perais
> 
>

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


Re: [gem5-dev] Review Request 3743: cpu: implement L-TAGE branch predictor

2016-11-23 Thread Arthur Perais


> On nov. 23, 2016, 3:17 après-midi, Jason Lowe-Power wrote:
> > src/cpu/pred/BranchPredictor.py, line 100
> > 
> >
> > Comment is now wrong.
> > 
> > Also, does it make more sense to call this histBufferEntries (or 
> > histBufferBits) and make the default 2**21? (or would it be 2**21*8? <-- 
> > this is the confusion.) 
> > 
> > I think it would be more clear to use bits or entries, personally. But 
> > if the branch predictor papers usually talk about the history buffer in 
> > terms of bytes instead of bits, I guess the current version makes more 
> > sense.
> 
> Nathanael Premillieu wrote:
> This big buffer is just a software structure and it does not represent a 
> hardware one.
> It is just done like this to simplify the management of the branch 
> history and avoid copying those bits too often.
> The modeled hardware history is implemented as a sliding window on this 
> big buffer.

In effect, the predictor will create an array of 2**21 entries, each entry 
containing an uint8_t, so the buffer is indeed 2MB, even though it contains 
2Mbits of history information. As Nathanael pointed out, this is a trick to 
avoid using a linked-list/deque which is more costly. The buffer is much bigger 
than the actual history size to remove bound checks and wrap-around adjustment 
that would be necessary with a circular buffer in *most* places where it is 
accessed, at the cost of copying the whole history from the beginning to the 
end of the buffer when the buffer is filled (so it has to be big enough to 
amortize the cost of the copy).

If you really think it should be changed to Param.Unsigned, I'll change it. 
I'll wait for some input before posting the version with the correct comment 
(my bad).


- Arthur


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3743/#review9160
---


On nov. 23, 2016, 2:52 après-midi, Arthur Perais wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/3743/
> ---
> 
> (Updated nov. 23, 2016, 2:52 après-midi)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 11707:1d085f66c4ca
> ---
> 
> cpu: implement an L-TAGE branch predictor
> 
> This patch implements an L-TAGE predictor, based on André Seznec's code 
> available from 
> CBP-2 
> (http://hpca23.cse.tamu.edu/taco/camino/cbp2/cbp-src/realistic-seznec.h). The
> patch also changes the default branch predictor of o3 from the tournament 
> predictor
> to L-TAGE.
> 
> This patch requires patch #3727 (http://reviews.gem5.org/r/3727/) to compile.
> 
> 
> Diffs
> -
> 
>   src/cpu/pred/BranchPredictor.py 1d085f66c4ca 
>   src/cpu/pred/SConscript 1d085f66c4ca 
>   src/cpu/pred/ltage.hh PRE-CREATION 
>   src/cpu/pred/ltage.cc PRE-CREATION 
> 
> Diff: http://reviews.gem5.org/r/3743/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Arthur Perais
> 
>

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


Re: [gem5-dev] Review Request 3743: cpu: implement L-TAGE branch predictor

2016-11-23 Thread Nathanael Premillieu


> On Nov. 23, 2016, 3:17 p.m., Jason Lowe-Power wrote:
> > src/cpu/pred/BranchPredictor.py, line 100
> > 
> >
> > Comment is now wrong.
> > 
> > Also, does it make more sense to call this histBufferEntries (or 
> > histBufferBits) and make the default 2**21? (or would it be 2**21*8? <-- 
> > this is the confusion.) 
> > 
> > I think it would be more clear to use bits or entries, personally. But 
> > if the branch predictor papers usually talk about the history buffer in 
> > terms of bytes instead of bits, I guess the current version makes more 
> > sense.

This big buffer is just a software structure and it does not represent a 
hardware one.
It is just done like this to simplify the management of the branch history and 
avoid copying those bits too often.
The modeled hardware history is implemented as a sliding window on this big 
buffer.


- Nathanael


---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3743/#review9160
---


On Nov. 23, 2016, 2:52 p.m., Arthur Perais wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/3743/
> ---
> 
> (Updated Nov. 23, 2016, 2:52 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 11707:1d085f66c4ca
> ---
> 
> cpu: implement an L-TAGE branch predictor
> 
> This patch implements an L-TAGE predictor, based on André Seznec's code 
> available from 
> CBP-2 
> (http://hpca23.cse.tamu.edu/taco/camino/cbp2/cbp-src/realistic-seznec.h). The
> patch also changes the default branch predictor of o3 from the tournament 
> predictor
> to L-TAGE.
> 
> This patch requires patch #3727 (http://reviews.gem5.org/r/3727/) to compile.
> 
> 
> Diffs
> -
> 
>   src/cpu/pred/BranchPredictor.py 1d085f66c4ca 
>   src/cpu/pred/SConscript 1d085f66c4ca 
>   src/cpu/pred/ltage.hh PRE-CREATION 
>   src/cpu/pred/ltage.cc PRE-CREATION 
> 
> Diff: http://reviews.gem5.org/r/3743/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Arthur Perais
> 
>

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


Re: [gem5-dev] Review Request 3743: cpu: implement L-TAGE branch predictor

2016-11-23 Thread Nathanael Premillieu

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3743/#review9159
---

Ship it!


Ship It!

- Nathanael Premillieu


On Nov. 23, 2016, 2:52 p.m., Arthur Perais wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/3743/
> ---
> 
> (Updated Nov. 23, 2016, 2:52 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 11707:1d085f66c4ca
> ---
> 
> cpu: implement an L-TAGE branch predictor
> 
> This patch implements an L-TAGE predictor, based on André Seznec's code 
> available from 
> CBP-2 
> (http://hpca23.cse.tamu.edu/taco/camino/cbp2/cbp-src/realistic-seznec.h). The
> patch also changes the default branch predictor of o3 from the tournament 
> predictor
> to L-TAGE.
> 
> This patch requires patch #3727 (http://reviews.gem5.org/r/3727/) to compile.
> 
> 
> Diffs
> -
> 
>   src/cpu/pred/BranchPredictor.py 1d085f66c4ca 
>   src/cpu/pred/SConscript 1d085f66c4ca 
>   src/cpu/pred/ltage.hh PRE-CREATION 
>   src/cpu/pred/ltage.cc PRE-CREATION 
> 
> Diff: http://reviews.gem5.org/r/3743/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Arthur Perais
> 
>

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


Re: [gem5-dev] Review Request 3743: cpu: implement L-TAGE branch predictor

2016-11-23 Thread Arthur Perais

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3743/
---

(Updated nov. 23, 2016, 2:52 après-midi)


Review request for Default.


Changes
---

Modified patch according to comments (Tournament is still default, added 
comments, history size).


Repository: gem5


Description
---

Changeset 11707:1d085f66c4ca
---

cpu: implement an L-TAGE branch predictor

This patch implements an L-TAGE predictor, based on André Seznec's code 
available from 
CBP-2 (http://hpca23.cse.tamu.edu/taco/camino/cbp2/cbp-src/realistic-seznec.h). 
The
patch also changes the default branch predictor of o3 from the tournament 
predictor
to L-TAGE.

This patch requires patch #3727 (http://reviews.gem5.org/r/3727/) to compile.


Diffs (updated)
-

  src/cpu/pred/BranchPredictor.py 1d085f66c4ca 
  src/cpu/pred/SConscript 1d085f66c4ca 
  src/cpu/pred/ltage.hh PRE-CREATION 
  src/cpu/pred/ltage.cc PRE-CREATION 

Diff: http://reviews.gem5.org/r/3743/diff/


Testing
---


Thanks,

Arthur Perais

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


Re: [gem5-dev] Review Request 3743: cpu: implement L-TAGE branch predictor

2016-11-22 Thread Arthur Perais
Jason, thanks for the feedback, 

I'll do my best to address everything tomorrow, but I've added some replies to 
your comments inline. 

- Mail original -

> De: "Jason Lowe-Power" 
> À: "Default" , "Jason Lowe-Power" ,
> "Arthur Perais" 
> Envoyé: Mardi 22 Novembre 2016 19:15:03
> Objet: Re: Review Request 3743: cpu: implement L-TAGE branch predictor

> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/3743/

> A few thing below. Overall, could you please add more comments? No need to
> write the entire TAGE paper in the comments, but a little high-level
> information would help new people coming to this code a lot.

Will do. 

> src/cpu/o3/O3CPU.py (Diff revision 1)
> def support_take_over(cls):
> 142
> branchPred = Param . BranchPredictor ( TournamentBP ( numThreads =
> 142
> branchPred = Param . BranchPredictor ( LTAGE ( numThreads =

> I'm not sure we want to make the TAGE predictor the default. Are there any
> products that use something like the TAGE predictor?

> Alternatively, if the tournamentBP is much worse than current products, I can
> see the argument for make the TAGE predictor the default.

Well, officially no, but then again, Seznec is the only guy that has an Intel 
Research Impact Medal so maybe :) In addition, perceptron is known to be in AMD 
(Zen I think?) and Samsung (Mongoose, which is mobile-class) CPUs, and its 
performance is ~comparable to TAGE, so in that sense, it is reasonable to have 
a high performing branch predictor as the default for the out-of-order CPU. I'm 
not pushing for it though, totally fine to keep the tournament as default. 

> src/cpu/pred/BranchPredictor.py (Diff revision 1) 99
> histBufferSize = Param . MemorySize ( "128MB" ,

> Is the global history really 128 megabytes?

> I'm not very familiar with the TAGE predictor. Does a real implementation
> somehow reduce this size and you're making a simplifying assumtion here?

I'm not quite sure what is happening here, but 128MB is way overkill. TAGE 
typically tracks the past ~100 - ~1000 last outcomes. Will look into it. 

> src/cpu/pred/ltage.hh (Diff revision 1) 181
> int bindex ( Addr pc_in ) const ;

> Can you add some comments on all of these functions? Doxygen comments would
> be best.

> src/cpu/pred/ltage.cc (Diff revision 1) 68
> history . globalHistory = new uint8_t [ histBufferSize ];
> I'm not sure if you're really using "histBufferSize" as a "MemorySize". This
> seems to just be a count that defaults to 2**27. Is this really a
> MemorySize?

Shouldn't be. Will look into it as well. 

> - Jason Lowe-Power

> On November 22nd, 2016, 2:31 p.m. UTC, Arthur Perais wrote:
> Review request for Default.
> By Arthur Perais.

> Updated Nov. 22, 2016, 2:31 p.m.
> Repository: gem5
> Description
> Changeset 11707:1d085f66c4ca

> cpu: implement an L-TAGE branch predictor

> This patch implements an L-TAGE predictor, based on André Seznec's code
> available from CBP-2
> (http://hpca23.cse.tamu.edu/taco/camino/cbp2/cbp-src/realistic-seznec.h).
> The patch also changes the default branch predictor of o3 from the
> tournament predictor to L-TAGE.

> This patch requires patch #3727 (http://reviews.gem5.org/r/3727/) to compile.
> Diffs

> * src/cpu/o3/O3CPU.py (1d085f66c4ca)
> * src/cpu/pred/BranchPredictor.py (1d085f66c4ca)
> * src/cpu/pred/SConscript (1d085f66c4ca)
> * src/cpu/pred/ltage.hh (PRE-CREATION)
> * src/cpu/pred/ltage.cc (PRE-CREATION)

> View Diff
___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev


Re: [gem5-dev] Review Request 3743: cpu: implement L-TAGE branch predictor

2016-11-22 Thread Jason Lowe-Power

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/3743/#review9154
---


A few thing below. Overall, could you please add more comments? No need to 
write the entire TAGE paper in the comments, but a little high-level 
information would help new people coming to this code a lot.


src/cpu/o3/O3CPU.py (line 142)


I'm not sure we want to make the TAGE predictor the default. Are there any 
products that use something like the TAGE predictor?

Alternatively, if the tournamentBP is *much* worse than current products, I 
can see the argument for make the TAGE predictor the default.



src/cpu/pred/BranchPredictor.py (line 99)


Is the global history really 128 megabytes?

I'm not very familiar with the TAGE predictor. Does a real implementation 
somehow reduce this size and you're making a simplifying assumtion here?



src/cpu/pred/ltage.hh (line 181)


Can you add some comments on all of these functions? Doxygen comments would 
be best.



src/cpu/pred/ltage.cc (line 68)


I'm not sure if you're really using "histBufferSize" as a "MemorySize". 
This seems to just be a count that defaults to 2**27. Is this really a 
MemorySize?


- Jason Lowe-Power


On Nov. 22, 2016, 2:31 p.m., Arthur Perais wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/3743/
> ---
> 
> (Updated Nov. 22, 2016, 2:31 p.m.)
> 
> 
> Review request for Default.
> 
> 
> Repository: gem5
> 
> 
> Description
> ---
> 
> Changeset 11707:1d085f66c4ca
> ---
> 
> cpu: implement an L-TAGE branch predictor
> 
> This patch implements an L-TAGE predictor, based on André Seznec's code 
> available from 
> CBP-2 
> (http://hpca23.cse.tamu.edu/taco/camino/cbp2/cbp-src/realistic-seznec.h). The
> patch also changes the default branch predictor of o3 from the tournament 
> predictor
> to L-TAGE.
> 
> This patch requires patch #3727 (http://reviews.gem5.org/r/3727/) to compile.
> 
> 
> Diffs
> -
> 
>   src/cpu/o3/O3CPU.py 1d085f66c4ca 
>   src/cpu/pred/BranchPredictor.py 1d085f66c4ca 
>   src/cpu/pred/SConscript 1d085f66c4ca 
>   src/cpu/pred/ltage.hh PRE-CREATION 
>   src/cpu/pred/ltage.cc PRE-CREATION 
> 
> Diff: http://reviews.gem5.org/r/3743/diff/
> 
> 
> Testing
> ---
> 
> 
> Thanks,
> 
> Arthur Perais
> 
>

___
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev