Re: [gem5-users] ARM with 64 cores FS hangs

2018-05-09 Thread Haiyang Han
Currently v4.4 works fine with the extensions enabled. Thanks a lot!
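
A note on the quoted patch below: the stanzas it adds to armv8.dts follow a
fixed pattern, so they could be generated rather than typed by hand. This is
a minimal sketch, assuming the pattern visible in the quoted hunk
(`#if CONF_CPUS > n`, `CPU(n)`, `#endif`) continues unchanged up to CPU 63.
The helper name `cpu_stanzas` is hypothetical and not part of gem5, and the
output carries no indentation.

```python
def cpu_stanzas(first, last):
    """Return the preprocessor stanzas for CPU indices first..last inclusive.

    Hypothetical helper: it only reproduces the three-line pattern seen in
    the quoted armv8.dts hunk and emits no leading whitespace.
    """
    lines = []
    for n in range(first, last + 1):
        lines.append("#if CONF_CPUS > %d" % n)
        lines.append("CPU(%d)" % n)
        lines.append("#endif")
    return "\n".join(lines)


# The stock file stops at CPU(15); a 64-core dtb needs indices 16..63.
stanzas = cpu_stanzas(16, 63)
print(stanzas)
```

Indices 16 to 63 give 48 stanzas of three lines each, i.e. 144 added lines,
which matches the hunk header (6 lines before, 150 after) in the quoted diff.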

On Wed, May 9, 2018 at 2:23 AM Ciro Santilli <ciro.santi...@gmail.com>
wrote:

> Yes, 16 also needs gem5_extensions = True according to my experiments,
> just like 64.
>
> I think v4.4 should work as you have the gic commit on top, but I didn't
> test.
>
> On Wed, May 9, 2018 at 8:17 AM, Haiyang Han
> <haiyang@eecs.northwestern.edu> wrote:
> > Hi Ciro,
> >
> > Thanks for the reply. I made a mistake in my title. I actually meant the
> > system hangs with 16 cores.
> > Do I still need to set root.system.realview.gic.gem5_extensions to True in
> > fs.py?
> > Is the GICv2 extension only compatible with Linux kernel 4.15? Currently I'm
> > using 4.4.
> >
> > Thanks,
> > Haiyang
> >
> > On Wed, May 9, 2018 at 2:11 AM Ciro Santilli <ciro.santi...@gmail.com>
> > wrote:
> >>
> >> As mentioned at:
> >> https://www.mail-archive.com/gem5-dev@gem5.org/msg24593.html you need
> >> to:
> >>
> >> - use the ARM linux kernel fork from:
> >> https://gem5.googlesource.com/arm/linux/+/refs/heads/gem5/v4.15 in
> >> particular the GICv2 extension script commit
> >> - hack up gem5:
> >>   - fs.py to set `root.system.realview.gic.gem5_extensions = True`
> >>   - generate a 32 / 64 / etc. dtb if needed (16 already built by default)
> >>
> >> The following patch will do both:
> >>
> >> ```
> >> diff --git a/configs/example/fs.py b/configs/example/fs.py
> >> index 4031fd05e..51dd1d4ff 100644
> >> --- a/configs/example/fs.py
> >> +++ b/configs/example/fs.py
> >> @@ -395,5 +395,6 @@ if buildEnv['TARGET_ISA'] == "arm" and options.generate_dtb:
> >>  sys = getattr(root, sysname)
> >>  sys.dtb_filename = create_dtb_for_system(sys, '%s.dtb' % sysname)
> >>
> >> +root.system.realview.gic.gem5_extensions = True
> >>  Simulation.setWorkCountOptions(test_sys, options)
> >>  Simulation.run(options, root, test_sys, FutureClass)
> >> diff --git a/system/arm/dt/Makefile b/system/arm/dt/Makefile
> >> index 62cf65f27..66f17a3c4 100644
> >> --- a/system/arm/dt/Makefile
> >> +++ b/system/arm/dt/Makefile
> >> @@ -38,6 +38,8 @@ TARGETS=\
> >>   armv8_gem5_v1_1cpu.dtb armv8_gem5_v1_2cpu.dtb \
> >>   armv8_gem5_v1_4cpu.dtb armv8_gem5_v1_8cpu.dtb \
> >>   armv8_gem5_v1_16cpu.dtb \
> >> + armv8_gem5_v1_32cpu.dtb \
> >> + armv8_gem5_v1_64cpu.dtb \
> >>   armv8_gem5_v1_big_little_2_2.dtb \
> >>   armv8_gem5_v1_big_little_2_4.dtb
> >>
> >> diff --git a/system/arm/dt/armv8.dts b/system/arm/dt/armv8.dts
> >> index 9e07decbd..ddef086a4 100644
> >> --- a/system/arm/dt/armv8.dts
> >> +++ b/system/arm/dt/armv8.dts
> >> @@ -105,6 +105,150 @@
> >>   CPU(15)
> >>   #endif
> >>   #if CONF_CPUS > 16
> >> + CPU(16)
> >> + #endif
> >> + #if CONF_CPUS > 17
> >> + CPU(17)
> >> + #endif
> >> + #if CONF_CPUS > 18
> >> + CPU(18)
> >> + #endif
> >> + #if CONF_CPUS > 19
> >> + CPU(19)
> >> + #endif
> >> + #if CONF_CPUS > 20
> >> + CPU(20)
> >> + #endif
> >> + #if CONF_CPUS > 21
> >> + CPU(21)
> >> + #endif
> >> + #if CONF_CPUS > 22
> >> + CPU(22)
> >> + #endif
> >> + #if CONF_CPUS > 23
> >> + CPU(23)
> >> + #endif
> >> + #if CONF_CPUS > 24
> >> + CPU(24)
> >> + #endif
> >> + #if CONF_CPUS > 25
> >> + CPU(25)
> >> + #endif
> >> + #if CONF_CPUS > 26
> >> + CPU(26)
> >> + #endif
> >> + #if CONF_CPUS > 27
> >> + CPU(27)
> >> + #endif
> >> + #if CONF_CPUS > 28
> >> + CPU(28)
> >> + #endif
> >> + #if CONF_CPUS > 29
> >> + CPU(29)
> >> + #endif
> >> + #if CONF_CPUS > 30
> >> + CPU(30)
> >> + #endif
> >> + #if CONF_CPUS > 31
> >> + CPU(31)
> >> + #endif
> >> + #if CONF_CPUS > 32
> >> + CPU(32)
> >> + #endif
> >> + #if CONF_CPUS > 33
> >> + CPU(33)
> >> + #endif
> >> + #if CONF_CPUS > 34
> >> + CPU(34)
> >> + #endif
> >> + #if CONF_CPUS > 35
> >> + CPU(35)
> >> + #endif
> >> + #if CONF_CPUS > 36
> >> +

[gem5-users] ARM with 64 cores FS hangs

2018-05-08 Thread Haiyang Han
Hi all,

I'm experiencing problems trying to boot up a 16-core ARM system. The
command line I'm using is:
/home/hhu010/tools/gem5/build/ARM/gem5.opt
/home/hhu010/tools/gem5/configs/example/fs.py
--disk-image=/home/hhu010/tools/arm-gem5-rsk/fs_files/disks/aarch64-parsec-3.0-all.img
--caches --l2cache --machine-type=VExpress_GEM5_V1 -n 16

I can boot up with 1, 2, 4 and 8 cores, but with 16 cores the terminal
doesn't output anything.

Does anyone have a similar problem?

Thanks,
Haiyang
___
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

[gem5-users] ARM FS Boot hang

2018-04-13 Thread Haiyang Han
Hi all,

I am trying to boot a modified version of gem5 in ARM, FS, multicore mode.
The boot hangs partway through. Here's the terminal output:

 m5 slave terminal: Terminal 0 
[0.00] Booting Linux on physical CPU 0x0
[0.00] Initializing cgroup subsys cpu
[0.00] Linux version 4.4.0+ (root@bbdeb8fab105) (gcc version 5.4.0 20160609 (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.4) ) #1 SMP PREEMPT Fri Jun 16 09:13:26 UTC 2017
[0.00] Boot CPU: AArch64 Processor [410fc0f0]
[0.00] Memory limited to 512MB
[0.00] cma: Reserved 16 MiB at 0x9f00
[0.00] On node 0 totalpages: 131072
[0.00]   DMA zone: 2048 pages used for memmap
[0.00]   DMA zone: 0 pages reserved
[0.00]   DMA zone: 131072 pages, LIFO batch:31
[0.00] PERCPU: Embedded 15 pages/cpu @ffc01efd5000 s23320 r8192 d29928 u61440
[0.00] pcpu-alloc: s23320 r8192 d29928 u61440 alloc=15*4096
[0.00] pcpu-alloc: [0] 0 [0] 1
[0.00] Detected PIPT I-cache on CPU0
[0.00] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 129024
[0.00] Kernel command line: earlyprintk=pl011,0x1c09 console=ttyAMA0 lpj=19988480 norandmaps rw loglevel=8 mem=512MB root=/dev/sda1
[0.00] PID hash table entries: 2048 (order: 2, 16384 bytes)
[0.00] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
[0.00] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
[0.00] software IO TLB [mem 0x99c0-0x9dc0] (64MB) mapped at [ffc019c0-ffc01dbf]
[0.00] Memory: 416308K/524288K available (5342K kernel code, 347K rwdata, 1964K rodata, 232K init, 237K bss, 91596K reserved, 16384K cma-reserved)
[0.00] Virtual kernel memory layout:
[0.00] vmalloc : 0xff80 - 0xffbdbfff   ( 246 GB)
[0.00] vmemmap : 0xffbdc000 - 0xffbfc000   ( 8 GB maximum)
[0.00]   0xffbdc200 - 0xffbdc280   ( 8 MB actual)
[0.00] fixed   : 0xffbffa7fd000 - 0xffbffac0   (4108 KB)
[0.00] PCI I/O : 0xffbffae0 - 0xffbffbe0   (16 MB)
[0.00] modules : 0xffbffc00 - 0xffc0   (64 MB)
[0.00] memory  : 0xffc0 - 0xffc02000   ( 512 MB)
[0.00]   .init : 0xffc0007a4000 - 0xffc0007de000   ( 232 KB)
[0.00]   .text : 0xffc8 - 0xffc0007a3b54   (7311 KB)
[0.00]   .data : 0xffc0007f1000 - 0xffc000847ec0   ( 348 KB)
[0.00] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[0.00] Preemptible hierarchical RCU implementation.
[0.00] Build-time adjustment of leaf fanout to 64.
[0.00] RCU restricting CPUs from NR_CPUS=256 to nr_cpu_ids=2.
[0.00] RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=2
[0.00] NR_IRQS:64 nr_irqs:64 0
[0.00] Architected cp15 timer(s) running at 25.16MHz (virt).
[0.00] clocksource: arch_sys_counter: mask: 0xff max_cycles: 0x5cdd39714, max_idle_ns: 440795202620 ns
[0.01] sched_clock: 56 bits at 25MHz, resolution 39ns, wraps every 4398046511084ns
[0.46] Console: colour dummy device 80x25
[0.49] Calibrating delay loop (skipped) preset value.. 9994.24 BogoMIPS (lpj=19988480)
[0.53] pid_max: default: 32768 minimum: 301
[0.75] Mount-cache hash table entries: 1024 (order: 1, 8192 bytes)
[0.79] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)
[0.000255] ASID allocator initialised with 256 entries
[0.032013] Detected PIPT I-cache on CPU1
[0.032021] CPU1: Booted secondary processor [410fc0f0]
[0.040031] Brought up 2 CPUs
[0.040035] SMP: Total of 2 processors activated.
[0.040039] CPU: All CPU(s) started at EL1
[0.040178] devtmpfs: initialized
[0.048043] clocksource: jiffies: mask: 0x max_cycles: 0x, max_idle_ns: 764504178510 ns
[0.052247] atomic64_test: passed
[0.060722] NET: Registered protocol family 16
[0.080409] vdso: 2 pages (1 code @ ffc0007f9000, 1 data @ ffc0007f8000)
[0.080418] hw-breakpoint: found 2 breakpoint and 2 watchpoint registers.
[0.080773] DMA: preallocated 256 KiB pool for atomic allocations
[0.080777] Serial: AMBA PL011 UART driver
[0.081315] 1c09.uart: ttyAMA0 at MMIO 0x1c09 (irq = 8, base_baud = 0) is a PL011 rev3
[0.081464] console [ttyAMA0] enabled
[0.112166] vgaarb: loaded
[0.112297] SCSI subsystem initialized
[0.120087] libata version 3.00 loaded.
[0.120189] usbcore: registered new interface driver usbfs
[0.120215] usbcore: registered new interface driver hub
[0.120234] usbcore: registered new device driver usb
[0.120279] pps_core: LinuxPPS API ver. 1 registered
[0.120282] pps_core: Software ver. 

[gem5-users] Status of X86 Multicore Timing/O3 on Classic Memory, FS?

2018-03-24 Thread Haiyang Han
Hi all,

I've read that multicore X86 full system simulations running on timing/O3
CPUs with the classic memory system don't work in gem5, because lockedRMW
instructions are not implemented. Has this been fixed in the latest gem5
versions? I think a patch (http://reviews.gem5.org/r/2691/) tried to fix it
a long time ago, but it never got into the main branch. So I'm assuming it
still does not work? It would be great if someone could clarify this.

Thanks,
Haiyang

[gem5-users] Restoring checkpoints with ruby?

2018-03-16 Thread Haiyang Han
Hi all,

I'm trying to create and restore checkpoints with Ruby while simulating a
16-core O3CPU, full system, x86 configuration. I can create the checkpoints
with no problem, but a little while after restoring from them, gem5 aborts
with various panics. Sometimes it complains about a possible deadlock, other
times it complains that FIFO ordering is violated. Below are the Ruby
protocols I tried:

Protocol used to write checkpoint    Protocol used to restore
MOESI_hammer                         MESI_Two_level
MOESI_hammer                         MESI_Three_level
MESI_Two_level                       MESI_Two_level
MESI_Three_level                     MESI_Three_level

I read on http://gem5.org/Checkpoints that only MOESI_hammer supports the
writing of checkpoints. Does this still apply to the newest gem5 versions?
Could it be that the traffic generated by the 16 cores is too much for the
ruby system to handle correctly? Is it possible to solve the deadlock issue
by manually increasing a threshold somewhere in the source code? What about
the FIFO ordering violation? It'll be great if any of these are answered :D

Thanks!
Haiyang

[gem5-users] Define Cache Address Ranges?

2018-02-02 Thread Haiyang Han
Hi all,

In BaseCache the address range is defined as
"VectorParam.AddrRange([AllMemory])". How should I set it if I want 4
separate cache banks with interleaved addresses? Should I do something like:
addr_ranges = start:finish:high bit used for interleaving:0:number of
interleaving bits:matching value?
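
To make the fields in that question concrete, here is a small standalone
sketch of how an address could be mapped to one of 4 banks. It is plain
Python, not gem5 code. It assumes the bank is selected by a contiguous bit
field whose top bit is the "high bit used for interleaving" and whose width
is the "number of interleaving bits", and that cache lines are 64 bytes. The
function name `bank_of` and both constants are hypothetical.

```python
def bank_of(addr, intlv_high_bit, intlv_bits):
    """Return the bank index selected by a contiguous bit field of addr.

    The field is intlv_bits wide and its most significant bit is
    intlv_high_bit (an assumed layout, not taken from the gem5 source).
    """
    low_bit = intlv_high_bit - intlv_bits + 1
    return (addr >> low_bit) & ((1 << intlv_bits) - 1)


# 4 banks need 2 interleaving bits. With 64-byte lines (assumed), bits 7:6
# sit just above the 6 line-offset bits, so consecutive lines rotate
# through the banks.
INTLV_BITS = 2
INTLV_HIGH_BIT = 7

banks = [bank_of(line * 64, INTLV_HIGH_BIT, INTLV_BITS) for line in range(6)]
print(banks)
```

Under that assumption, the four banks would share the same start, finish,
high bit and bit count, and differ only in the matching value (0 to 3).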

Thanks

[gem5-users] Cache that always misses?

2018-01-22 Thread Haiyang Han
Hi,

With the goal of simulating a shared L1 cache, I would like to modify the
gem5 cache source code to create a cache that always misses. What would be
the best approach? Should I just forward packets between the CPU side and
the mem side? Or should I somehow clear the valid bit every time a write
occurs? (This may be tricky, as it might break some of the functional and
timing access code in the cache.) Or should I put every request in the
MSHR?
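
As a toy illustration of the first option (forwarding), the sketch below
models a cache that never allocates a line, so every access reaches the
memory side. It is plain Python, not gem5 code, and it ignores timing, ports
and MSHRs entirely. The class names `Memory` and `AlwaysMissCache` are
hypothetical.

```python
class Memory:
    """Backing store that counts how many requests reach it."""

    def __init__(self):
        self.data = {}
        self.accesses = 0

    def read(self, addr):
        self.accesses += 1
        return self.data.get(addr, 0)

    def write(self, addr, value):
        self.accesses += 1
        self.data[addr] = value


class AlwaysMissCache:
    """Never allocates a line: every request is forwarded to the memory side."""

    def __init__(self, mem_side):
        self.mem_side = mem_side
        self.misses = 0

    def read(self, addr):
        self.misses += 1
        return self.mem_side.read(addr)

    def write(self, addr, value):
        self.misses += 1
        self.mem_side.write(addr, value)


mem = Memory()
cache = AlwaysMissCache(mem)
cache.write(0x40, 7)
value = cache.read(0x40)  # same address again, still forwarded to memory
```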

Thanks,
Haiyang

[gem5-users] Shared L1 Cache and DerivO3CPU

2017-09-15 Thread Haiyang Han
Hi all,

I am trying to simulate a system with a shared L1 cache. I created two
noncoherent crossbars, one for L1D and one for L1I and connected them to
each CPU's dcache_port and icache_port, respectively.  When using an O3CPU
to simulate a full system I get the following error:

gem5.opt: build/X86/mem/xbar.cc:190: bool BaseXBar::Layer::tryTiming(SrcType*) [with SrcType = SlavePort; DstType = MasterPort]: Assertion `std::find(waitingForLayer.begin(), waitingForLayer.end(), src_port) == waitingForLayer.end()' failed.

I read in an archived mail that the ports in an O3CPU have to be connected
to a cache and not a crossbar. If so, is there any way that I can still
realize a shared L1 cache?

Thanks,
Haiyang