> On Jan. 18, 2013, 2:56 p.m., Nilay Vaish wrote:
> > Ali, Steve, can you take a look at this patch?
> > 
> > In my opinion, all the changes being made to the x86 TLB are pointless.
> > If we have to replicate everything within the TLB for each thread, 
> > why not create completely separate TLB structures? The CPU can just 
> > make the translation call to the TLB corresponding to the thread that 
> > needs the translation.
> 
> Steve Reinhardt wrote:
>     I agree, if we're going to replicate all the state in the TLB, we might 
> as well just instantiate multiple TLB objects.
>     
>     I also don't understand why se.py needs to change; doesn't it already 
> support multithreaded jobs?
>     
>     I think the zeroreg fix could be committed as is though (I didn't look at 
> it closely, but if it's a real bug fix it should just go in).
>

I'm not sure about the best way to handle the TLB issue. I don't really know 
how it's done on real cores, but I doubt that for every thread you get another 
N TLB entries. I suppose it's OK if you make each of them smaller. Similarly, 
I'm not sure that you want multiple page table walkers. 
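
To make the trade-off concrete, here is a minimal sketch of the "separate TLB 
objects, each smaller" idea. It is a toy model, not gem5 code: ToyTLB, 
make_thread_tlbs and translate are hypothetical names, and the even split of 
a fixed entry budget is an assumption rather than what any real core does.

```python
from collections import OrderedDict


class ToyTLB:
    """Tiny fully associative TLB with LRU replacement (illustrative only)."""

    def __init__(self, size):
        self.size = size
        self.entries = OrderedDict()  # virtual page number -> physical page number

    def lookup(self, vpn):
        if vpn in self.entries:
            self.entries.move_to_end(vpn)  # mark as most recently used
            return self.entries[vpn]
        return None  # miss: a real model would start a page table walk here

    def insert(self, vpn, ppn):
        if vpn in self.entries:
            self.entries.move_to_end(vpn)
        elif len(self.entries) >= self.size:
            self.entries.popitem(last=False)  # evict the least recently used entry
        self.entries[vpn] = ppn


def make_thread_tlbs(total_entries, num_threads):
    """Split a fixed entry budget evenly; each thread gets at least one entry."""
    per_thread = max(1, total_entries // num_threads)
    return [ToyTLB(per_thread) for _ in range(num_threads)]


def translate(tlbs, tid, vpn):
    """The CPU sends the request to the TLB that belongs to thread `tid`."""
    return tlbs[tid].lookup(vpn)


tlbs = make_thread_tlbs(total_entries=64, num_threads=4)
tlbs[0].insert(0x10, 0xA0)
tlbs[1].insert(0x10, 0xB0)  # same virtual page, different thread, different frame
```

With this layout the total number of entries stays at 64 no matter how many 
threads there are, and the TLB itself needs no thread ID in its entries. The 
sketch says nothing about whether the page table walker should be shared.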


- Ali


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://reviews.gem5.org/r/1281/#review3871
-----------------------------------------------------------


On July 2, 2012, 11:21 a.m., Andrea Pellegrini wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://reviews.gem5.org/r/1281/
> -----------------------------------------------------------
> 
> (Updated July 2, 2012, 11:21 a.m.)
> 
> 
> Review request for Default.
> 
> 
> Description
> -------
> 
> Changeset 9074:8a6f47da502a
> ---------------------------
> x86: Fix SMT support (zeroReg, TLBs)
> 
> 
> -----------------------------------------------------------------------------------------------------------------------
> - Fix the zeroReg problem
> 
> There was an issue w/ the rename logic, which would assign a "previous" 
> physical register to the ZeroReg architectural register in x86 (which, BTW, I 
> don't believe exists in x86).  This issue was giving problems for 
> instructions squashed in threads w/ ID different from 0, sometimes allowing 
> non-mispredicted instructions to obtain a value different from zero when 
> reading the zeroReg.
> 
> * changed cpu/o3/rename_map.cc
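
One way to picture the zeroReg fix is a rename map that never hands out a 
distinct "previous" register for the zero register, so undoing a squashed 
rename cannot redirect it. The sketch below is a toy model under that 
assumption; ToyRenameMap, ZERO_REG and the register numbers are hypothetical 
and are not taken from rename_map.cc.

```python
ZERO_REG = 16  # hypothetical architectural index of the zero register


class ToyRenameMap:
    """Single-thread rename map that pins the zero register (illustrative only)."""

    def __init__(self, num_arch_regs, free_list, zero_phys):
        self.zero_phys = zero_phys
        self.free_list = list(free_list)
        self.map = {}
        for reg in range(num_arch_regs):
            if reg == ZERO_REG:
                self.map[reg] = zero_phys  # fixed mapping, never renamed
            else:
                self.map[reg] = self.free_list.pop(0)

    def rename(self, arch_reg):
        """Return (new_phys, prev_phys) for a destination register."""
        if arch_reg == ZERO_REG:
            # New and previous mapping are the same pinned register.
            return self.zero_phys, self.zero_phys
        prev_phys = self.map[arch_reg]
        new_phys = self.free_list.pop(0)
        self.map[arch_reg] = new_phys
        return new_phys, prev_phys

    def squash(self, arch_reg, new_phys, prev_phys):
        """Undo one rename when the instruction is squashed."""
        if arch_reg == ZERO_REG:
            return  # nothing to restore and nothing to free
        self.map[arch_reg] = prev_phys
        self.free_list.insert(0, new_phys)


rmap = ToyRenameMap(num_arch_regs=17, free_list=range(100, 140), zero_phys=0)
new_phys, prev_phys = rmap.rename(ZERO_REG)
rmap.squash(ZERO_REG, new_phys, prev_phys)
```

In an SMT setup each thread would hold its own map of this kind, so the 
property has to hold for every thread ID and not only for thread 0.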
> 
> -----------------------------------------------------------------------------------------------------------------------
> - Replicated the pre-decoders
> 
> There was an issue w/ the pre-decoders, as already pointed out by a user on 
> the mailing list.
> 
> * The new fetch stage has one decoder per thread, which might have fixed it
> 
> -----------------------------------------------------------------------------------------------------------------------
> - Replicated the TLBs (both ITLB and DTLB)
> 
> This seems to be the simplest solution for x86, and it mimics design choices 
> made for the Hyper-Threading technology in the P4.
> "Hyper-Threading Technology Architecture and Microarchitecture" from Marr et 
> al.
> http://download.intel.com/technology/itj/2002/volume06issue01/art01_hyper/vol6iss1_art01.pdf
> 
> * Added Num threads in: src/arch/x86/X86TLB.py
>   If there is a better way to get this information, please let me know.
> 
> * Added thread information to the page table:
>   I am assigning a did to ask - ask was there and not used. I am not sure if 
> it will work in FS mode (or if it is correct in this model, but it seems to 
> be the most logical and simplest thing to do).
> 
> -----------------------------------------------------------------------------------------------------------------------
> - Changed the exit group sys call so that the program now exits only when 
> all the threads have terminated
> 
> Is there a better way to handle it?
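
The behaviour described here (end the simulation only after the last thread 
has exited) can be sketched as a set of live thread IDs. This is a toy model, 
not the code in syscalls.cc; ToyExitTracker and thread_exit are hypothetical 
names.

```python
class ToyExitTracker:
    """Tracks which hardware threads are still running (illustrative only)."""

    def __init__(self, num_threads):
        self.active = set(range(num_threads))

    def thread_exit(self, tid):
        """Mark thread `tid` as finished.

        Returns True only when no thread is left, i.e. when the
        simulation as a whole should stop.
        """
        self.active.discard(tid)  # a repeated exit from the same thread is a no-op
        return not self.active


tracker = ToyExitTracker(num_threads=4)
# Threads finish in an arbitrary order; only the last exit ends the run.
results = [tracker.thread_exit(tid) for tid in (3, 0, 1, 2)]
```

Keeping a set rather than a bare counter means a thread that reports its exit 
twice cannot end the simulation early.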
> 
> -----------------------------------------------------------------------------------------------------------------------
> - Changed the se.py script to support SMT.
> It seems to work for both single- and multi-threaded workloads.
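
The test run in this request passes one ';'-separated string to -c and gets 
one workload per hardware thread. A minimal sketch of that mapping follows; 
it assumes the split is on ';' and that thread IDs follow the order of 
appearance. split_workloads and assign_threads are hypothetical helpers, not 
functions from se.py, and the short paths stand in for the full ones in the 
log.

```python
def split_workloads(cmd):
    """Split a ';'-separated command string into one entry per workload."""
    return [part.strip() for part in cmd.split(";") if part.strip()]


def assign_threads(cmd):
    """Pair each workload with a thread ID, in order of appearance."""
    return dict(enumerate(split_workloads(cmd)))


# Hypothetical short paths standing in for the full paths in the log.
cmd = "bin/loop;bin/loop;bin/loop;bin/hello"
threads = assign_threads(cmd)
```

A single-threaded run is then just the case where the string contains no ';' 
and the resulting mapping has one entry.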
> 
> -----------------------------------------------------------------------------------------------------------------------
> 
> Test:
> 
> * Are there any regression tests for x86 SMT?
> 
> * Simple 4 threaded workload:
> 
> Andreas-MacBook-Air:smt apellegr$ ../build/X86/m5.debug 
> ../configs/example/se.py --maxinsts=500000000 --cpu-type=detailed --caches -n 
> 1 -c 
> '/Users/apellegr/Research/svnrepo/viperII/gem5-viper/tests/test-progs/loop/bin/x86/linux/loop;/Users/apellegr/Research/svnrepo/viperII/gem5-viper/tests/test-progs/loop/bin/x86/linux/loop;/Users/apellegr/Research/svnrepo/viperII/gem5-viper/tests/test-progs/loop/bin/x86/linux/loop;/Users/apellegr/Research/svnrepo/viperII/gem5-viper/tests/test-progs/hello/bin/x86/linux/hello'
> gem5 Simulator System.  http://gem5.org
> gem5 is copyrighted software; use the --copyright option for details.
> 
> gem5 compiled Jun 28 2012 17:49:19
> gem5 started Jun 28 2012 17:50:40
> gem5 executing on Andreas-MacBook-Air.local
> command line: ../build/X86/m5.debug ../configs/example/se.py 
> --maxinsts=500000000 --cpu-type=detailed --caches -n 1 -c 
> /Users/apellegr/Research/svnrepo/viperII/gem5-viper/tests/test-progs/loop/bin/x86/linux/loop;/Users/apellegr/Research/svnrepo/viperII/gem5-viper/tests/test-progs/loop/bin/x86/linux/loop;/Users/apellegr/Research/svnrepo/viperII/gem5-viper/tests/test-progs/loop/bin/x86/linux/loop;/Users/apellegr/Research/svnrepo/viperII/gem5-viper/tests/test-progs/hello/bin/x86/linux/hello
> /Users/apellegr/Research/svnrepo/viperII/gem5-viper/tests/test-progs/loop/bin/x86/linux/loop
> /Users/apellegr/Research/svnrepo/viperII/gem5-viper/tests/test-progs/loop/bin/x86/linux/loop
> /Users/apellegr/Research/svnrepo/viperII/gem5-viper/tests/test-progs/loop/bin/x86/linux/loop
> /Users/apellegr/Research/svnrepo/viperII/gem5-viper/tests/test-progs/hello/bin/x86/linux/hello
> Global frequency set at 1000000000000 ticks per second
> 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7002
> 0: system.remote_gdb.listener: listening for remote gdb #1 on port 7003
> 0: system.remote_gdb.listener: listening for remote gdb #2 on port 7004
> 0: system.remote_gdb.listener: listening for remote gdb #3 on port 7005
> **** REAL SIMULATION ****
> info: Entering event queue @ 0.  Starting simulation...
> warn: instruction 'fnstcw_Mw' unimplemented
> warn: instruction 'fldcw_Mw' unimplemented
> Hello world!
> Done!
> Done!
> Done!
> hack: be nice to actually delete the event here
> Exiting @ tick 37617000 because target called exit()
> 
> 
> Diffs
> -----
> 
>   configs/example/se.py f75ee4849c40 
>   src/arch/x86/X86TLB.py f75ee4849c40 
>   src/arch/x86/linux/syscalls.cc f75ee4849c40 
>   src/arch/x86/pagetable.hh f75ee4849c40 
>   src/arch/x86/pagetable.cc f75ee4849c40 
>   src/arch/x86/pagetable_walker.cc f75ee4849c40 
>   src/arch/x86/tlb.hh f75ee4849c40 
>   src/arch/x86/tlb.cc f75ee4849c40 
>   src/cpu/o3/rename_map.cc f75ee4849c40 
> 
> Diff: http://reviews.gem5.org/r/1281/diff/
> 
> 
> Thanks,
> 
> Andrea Pellegrini
> 
>

_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev