[gem5-users] Re: recvAtomicLogic() in mem_ctrl.cc

2023-07-11 Thread John Smith via gem5-users
Thank you Eliot and Ayaz for the well-explained responses. I understand
the code snippet now.

On Tue, Jul 11, 2023 at 5:46 PM Ayaz Akram  wrote:

> Hi Eliot,
>
> Based on my understanding, when pkt->makeResponse() is called, it updates
> the "cmd" of the pkt with the appropriate response command (this line of
> code: cmd = cmd.responseCommand();). If you look at
> "MemCmd::commandInfo[]" in packet.cc, the response command for a
> "WriteReq" command is "WriteResp", and the attributes of "WriteResp"
> don't include "HasData", which is why the response pkt will return
> false on a "hasData()" check.
>
> You might also want to look at the struct CommandInfo in packet.hh.
>
> -Ayaz
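
A toy, self-contained sketch of the mechanism described above (names are
simplified and the real tables in packet.hh/packet.cc contain many more
commands and attributes, but the WriteReq/WriteResp relationship is the same):

#include <bitset>
#include <cassert>
#include <initializer_list>

// Simplified model of gem5's MemCmd/CommandInfo machinery.
enum Command { InvalidCmd, ReadReq, ReadResp, WriteReq, WriteResp, NUM_COMMANDS };
enum Attribute { IsRead, IsWrite, IsResponse, NeedsResponse, HasData, NUM_ATTRIBUTES };

struct CommandInfo {
    std::bitset<NUM_ATTRIBUTES> attributes;  // properties of this command
    Command response;                        // command used when responding
};

static std::bitset<NUM_ATTRIBUTES> attrs(std::initializer_list<Attribute> as)
{
    std::bitset<NUM_ATTRIBUTES> b;
    for (Attribute a : as)
        b.set(a);
    return b;
}

// Sketch of the commandInfo[] table: the WriteReq itself carries data, but
// its response command (WriteResp) does not have the HasData attribute.
static const CommandInfo commandInfo[NUM_COMMANDS] = {
    /* InvalidCmd */ { {}, InvalidCmd },
    /* ReadReq    */ { attrs({IsRead, NeedsResponse}), ReadResp },
    /* ReadResp   */ { attrs({IsRead, IsResponse, HasData}), InvalidCmd },
    /* WriteReq   */ { attrs({IsWrite, NeedsResponse, HasData}), WriteResp },
    /* WriteResp  */ { attrs({IsWrite, IsResponse}), InvalidCmd },
};

struct Packet {
    Command cmd;
    bool hasData() const { return commandInfo[cmd].attributes[HasData]; }
    // Roughly what pkt->makeResponse() does: swap in the response command.
    void makeResponse() { cmd = commandInfo[cmd].response; }
};

int main()
{
    Packet pkt{WriteReq};
    assert(pkt.hasData());    // the write request carries the data to write
    pkt.makeResponse();       // mem_intr->access(pkt) turns it into a response
    assert(!pkt.hasData());   // WriteResp lacks HasData, so recvAtomicLogic()
                              // returns 0 for writes
    return 0;
}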
>
> On Tue, Jul 11, 2023 at 2:15 PM Eliot Moss via gem5-users <
> gem5-users@gem5.org> wrote:
>
>> On 7/11/2023 3:03 PM, John Smith wrote:
>> > Thanks for responding, Eliot. I somewhat understand that after the
>> > write is accomplished, the returning packet won't have the data. But
>> > still, why is the returned value 0 in that case? Shouldn't it still be
>> > equal to the memory access latency?
>>
>> In the Atomic case this code is assuming the write can
>> be absorbed into a write buffer, so there is no additional
>> latency visible to the user.  Of course it is *possible* to
>> saturate the buffers, and if you want a more accurate
>> accounting you can use a Timing model instead.
>>
>> EM
>> ___
>> gem5-users mailing list -- gem5-users@gem5.org
>> To unsubscribe send an email to gem5-users-le...@gem5.org
>>
>
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org


[gem5-users] Re: recvAtomicLogic() in mem_ctrl.cc

2023-07-11 Thread John Smith via gem5-users
Thanks for responding, Eliot. I somewhat understand that after the write
is accomplished, the returning packet won't have the data. But still, why
is the returned value 0 in that case? Shouldn't it still be equal to the
memory access latency?

On Tue, Jul 11, 2023 at 2:34 PM Eliot Moss  wrote:

> On 7/11/2023 1:28 PM, John Smith via gem5-users wrote:
> > So, I used the function pkt->isWrite() to check if the packet is a write
> > request, and I observed that inside the pkt->hasData() if condition,
> > pkt->isWrite() returned false. Hence only the read packets were entering
> > the if (pkt->hasData()) block.
>
> So you're saying that inside the if condition, pkt->isWrite is *always*
> false?
>
> I see.  I couldn't find a place in the code (in the version I have
> downloaded
> anyway) where the data is dropped, but I can imagine it happening after the
> write is accomplished (though I don't see why), so that the "returning"
> packet no longer has data.  What are the exact types of the components
> involved?  And maybe someone else is more competent to answer this since it
> is somewhat stumping me from my reading of the code.
>
> Cheers - Eliot
>
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org


[gem5-users] Re: recvAtomicLogic() in mem_ctrl.cc

2023-07-11 Thread John Smith via gem5-users
So, I used the function pkt->isWrite() to check if the packet is a write
request, and I observed that inside the pkt->hasData() if condition,
pkt->isWrite() returned false. Hence only the read packets were entering
the if (pkt->hasData()) block.

On Tue, Jul 11, 2023 at 1:10 PM Eliot Moss  wrote:

> On 7/11/2023 1:01 PM, Eliot Moss wrote:
> > On 7/11/2023 12:52 PM, John Smith wrote:
> >> Okay, but I've also noticed that a WriteReq generally carries no data.
> >> Why exactly is that? Because if we are writing to memory, then the
> >> memory access latency shouldn't be 0, right?
> >
> > I believe that happens if the write got its data by snooping a cache.
> > The packet still goes to the memory, but with the write suppressed.
> > This certainly happens in the Timing case; I admit I'm a little less
> > clear about the Atomic one.
>
> Sorry - I see I was responding about a read.
>
> So, what surprises me is that you're saying that write requests generally
> carry no data.  That doesn't seem right.  What leads you to that
> conclusion?
>
> Best - EM
>
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org


[gem5-users] Re: recvAtomicLogic() in mem_ctrl.cc

2023-07-11 Thread John Smith via gem5-users
Okay, but I've also noticed that a WriteReq generally carries no data. Why
exactly is that? Because if we are writing to memory, then the memory access
latency shouldn't be 0, right?

On Tue, Jul 11, 2023 at 12:49 PM Eliot Moss  wrote:

> On 7/11/2023 12:37 PM, John Smith via gem5-users wrote:
> > Hi everyone,
> >
> > Could someone please help me understand what's happening in the code
> > snippet below? It's the recvAtomicLogic() function in mem_ctrl.cc. Why
> > are we returning the latency as 0 if the packet doesn't have any data?
> > And in what case will the packet have/not have data?
> >
> > // do the actual memory access and turn the packet into a response
> > mem_intr->access(pkt);
> >
> > if (pkt->hasData()) {
> >     // this value is not supposed to be accurate, just enough to
> >     // keep things going, mimic a closed page
> >     // also this latency can't be 0
> >     return mem_intr->accessLatency();
> > }
> >
> > return 0;
>
> John - Certain packets carry no data.  For example, a cache line invalidate
> without write back will have that property.  Maybe others.
>
> Best - Eliot
>
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org


[gem5-users] recvAtomicLogic() in mem_ctrl.cc

2023-07-11 Thread John Smith via gem5-users
Hi everyone,

Could someone please help me understand what's happening in the code
snippet below? It's the recvAtomicLogic() function in mem_ctrl.cc. Why
are we returning the latency as 0 if the packet doesn't have any data? And
in what case will the packet have/not have data?

// do the actual memory access and turn the packet into a response
mem_intr->access(pkt);

if (pkt->hasData()) {
    // this value is not supposed to be accurate, just enough to
    // keep things going, mimic a closed page
    // also this latency can't be 0
    return mem_intr->accessLatency();
}

return 0;

-- 
Regards,
John Smith
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org


[gem5-users] Re: Error: snoop filter exceeded capacity

2023-07-10 Thread John Smith via gem5-users
I'm sorry. Here's the error message I got:

build/X86/mem/snoop_filter.cc:197: panic: panic condition !is_hit &&
(cachedLocations.size() >= maxEntryCount) occurred: snoop filter exceeded
capacity of 131072 cache blocks
Memory Usage: 17708540 KBytes
Program aborted at tick 3777287772000
--- BEGIN LIBC BACKTRACE ---
./build/X86/gem5.opt(+0x18c76d0)[0x55e6842006d0]
./build/X86/gem5.opt(+0x18ebaec)[0x55e684224aec]
/lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7fde9ce42520]
/lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7fde9ce96a7c]
/lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7fde9ce42476]
/lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7fde9ce287f3]
./build/X86/gem5.opt(+0x43b395)[0x55e682d74395]
./build/X86/gem5.opt(+0x12389ad)[0x55e683b719ad]
./build/X86/gem5.opt(+0x11ae404)[0x55e683ae7404]
./build/X86/gem5.opt(+0x131c1bd)[0x55e683c551bd]
./build/X86/gem5.opt(+0x1303f96)[0x55e683c3cf96]
./build/X86/gem5.opt(+0x11afb05)[0x55e683ae8b05]
./build/X86/gem5.opt(+0x1324913)[0x55e683c5d913]
./build/X86/gem5.opt(+0x131d464)[0x55e683c56464]
./build/X86/gem5.opt(+0x1304042)[0x55e683c3d042]
./build/X86/gem5.opt(+0x11afb05)[0x55e683ae8b05]
./build/X86/gem5.opt(+0x1324913)[0x55e683c5d913]
./build/X86/gem5.opt(+0x131d464)[0x55e683c56464]
./build/X86/gem5.opt(+0x1304042)[0x55e683c3d042]
./build/X86/gem5.opt(+0x5cfbce)[0x55e682f08bce]
./build/X86/gem5.opt(+0x5f02c5)[0x55e682f292c5]
./build/X86/gem5.opt(+0xd04b7b)[0x55e68363db7b]
./build/X86/gem5.opt(+0xd3dd08)[0x55e683676d08]
./build/X86/gem5.opt(+0x5ce4cd)[0x55e682f074cd]
./build/X86/gem5.opt(+0x18db7d2)[0x55e6842147d2]
./build/X86/gem5.opt(+0x1904598)[0x55e68423d598]
./build/X86/gem5.opt(+0x1904b83)[0x55e68423db83]
./build/X86/gem5.opt(+0x8868f0)[0x55e6831bf8f0]
./build/X86/gem5.opt(+0x452322)[0x55e682d8b322]
/lib/x86_64-linux-gnu/libpython3.10.so.1.0(+0x12b6d3)[0x7fde9df2b6d3]
/lib/x86_64-linux-gnu/libpython3.10.so.1.0(_PyObject_Call+0x5c)[0x7fde9dee722c]
/lib/x86_64-linux-gnu/libpython3.10.so.1.0(_PyEval_EvalFrameDefault+0x4b26)[0x7fde9de76766]
--- END LIBC BACKTRACE ---

On Mon, Jul 10, 2023 at 3:52 PM Aaron Vose  wrote:

> Seems like your output was cut off? I don’t see the error message or the
> full command, it seems.
>
>
>
> Cheers,
>
> ~Aaron Vose
>
>
>
> *From:* John Smith via gem5-users 
> *Sent:* Monday, July 10, 2023 3:48 PM
> *To:* gem5-users@gem5.org
> *Cc:* John Smith 
> *Subject:* [gem5-users] Error: snoop filter exceeded capacity
>
> Hi everyone,
>
> I'm facing the below error when running fs.py with the following
> configurations:
>
>
>
> ./build/X86/gem5.opt ./configs/example/fs.py --cpu-clock=1GHz \
>   -n 8 \
>   --mem-size=16GB \
>   --caches --l2cache --l3cache \
>   --num-l2caches=8 --num-l3caches=1 \
>   --l1d_size=32kB --l1i_size=32kB --l2_size=512kB --l3_size=4MB \
>   --l1d_assoc=8 --l1i_assoc=8 --l2_assoc=8 --l3_assoc=64 \
>   --cacheline_size=64 \
>
>
>
> It would be great if someone could help me point out what exactly the
> problem is here. I tried running it with -n 4 as well but I'm still facing
> the same problem.
>
>
>
> --
>
> Regards,
>
> John Smith
>
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org


[gem5-users] Error: snoop filter exceeded capacity

2023-07-10 Thread John Smith via gem5-users
Hi everyone,
I'm facing the below error when running fs.py with the following
configurations:

./build/X86/gem5.opt ./configs/example/fs.py --cpu-clock=1GHz \
  -n 8 \
  --mem-size=16GB \
  --caches --l2cache --l3cache \
  --num-l2caches=8 --num-l3caches=1 \
  --l1d_size=32kB --l1i_size=32kB --l2_size=512kB --l3_size=4MB \
  --l1d_assoc=8 --l1i_assoc=8 --l2_assoc=8 --l3_assoc=64 \
  --cacheline_size=64 \

It would be great if someone could point out what exactly the problem is
here. I tried running with -n 4 as well, but I'm still facing the same
problem.


-- 
Regards,
John Smith
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org


[gem5-users] Re: Regarding the recvAtomic() function in mem_ctrl.cc

2023-07-07 Thread John Smith via gem5-users
Okay, got it. I'll simulate and try it out. I also wanted to ask how I
should go about configuring a 3-level cache hierarchy in fs.py. Could you
please let me know if the following configuration is correct? Do I need to
add any other cache-related arguments (I've already added the arguments for
the disk image and the kernel)?

/build/X86/gem5.opt ./configs/example/fs.py --cpu-clock=1GHz \
-n 8 \
--mem-size=16GB \
--num-l2caches=8 --num-l3caches=1 \
--l1d_size=20KB --l1i_size=20KB --l2_size=512KB --l3_size=4MB \
--l1d_assoc=8 --l1i_assoc=8 --l2_assoc=8 --l3_assoc=64 \
--cacheline_size=64 \

On Fri, Jul 7, 2023 at 1:49 PM Ayaz Akram  wrote:

> Hi John,
>
> The recvAtomic() change should work if your memory mode is "Atomic". It
> will not work if you are using "Timing" memory mode. I don't know what
> configuration you are simulating. If the memory mode is "Atomic" with
> Atomic CPU type, I think you should be able to see the impact of your
> change on the cpu.numCycles (total cycles taken to run the simulation). The
> real impact would depend on the characteristics of the benchmark.
>
> -Ayaz
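
Concretely, the change under discussion amounts to something like the
following in src/mem/mem_ctrl.cc (a sketch only; the exact signature of
recvAtomicLogic() can differ between gem5 versions, and the added latency is
visible only in Atomic memory mode):

// Sketch of the modified recvAtomicLogic(); only the "+ 100" is new.
Tick
MemCtrl::recvAtomicLogic(PacketPtr pkt, MemInterface* mem_intr)
{
    // do the actual memory access and turn the packet into a response
    mem_intr->access(pkt);

    if (pkt->hasData()) {
        // this value is not supposed to be accurate, just enough to
        // keep things going, mimic a closed page
        // also this latency can't be 0
        return mem_intr->accessLatency() + 100;  // 100 extra ticks per access
    }

    // responses without data (e.g. WriteResp) still report zero latency
    return 0;
}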
>
> On Fri, Jul 7, 2023 at 10:09 AM John Smith 
> wrote:
>
>> Thanks for the response, Ayaz. I just want to visually see the change in
>> the latencies of memory accesses if I increase the latency by 100 ticks (or
>> more). Is tweaking the frontEnd and backEnd latency the only way to do
>> that? Won't it work with the change in recvAtomic()?
>>
>> On Thu, Jul 6, 2023 at 10:46 PM Ayaz Akram  wrote:
>>
>>> Hi John,
>>>
>>> What's the exact stat you are looking at for AMAT? My guess is that it
>>> is not getting updated for Atomic mode memory accesses.
>>>
>>> interface. If I change the code to:
>>>> return mem_intr->accessLatency() + 100;
>>>> Does this mean that it will take 100 more ticks for the memory
>>>> controller to access the memory? If yes, then how can I visualize this
>>>> change?
>>>
>>>
>>> Yes, this means that the response from the controller will be delayed by
>>> 100 ticks. In case you are looking for a more detailed timing model and
>>> need to use Timing memory accesses, you can do something similar (adding
>>> delay) by tweaking the memory controller's frontEnd and backEnd latency
>>> parameters.
>>>
>>> -Ayaz
>>>
>>> On Thu, Jul 6, 2023 at 4:46 PM John Smith via gem5-users <
>>> gem5-users@gem5.org> wrote:
>>>
>>>> Hi everyone,
>>>> I have a doubt regarding the operation of the recvAtomic() function in
>>>> the memory controller. I can see that recvAtomic() calls recvAtomicLogic(),
>>>> which returns the access latency from the memory interface. If I change the
>>>> code to:
>>>> return mem_intr->accessLatency() + 100;
>>>>
>>>> Does this mean that it will take 100 more ticks for the memory
>>>> controller to access the memory? If yes, then how can I visualize this
>>>> change? The AMAT stats in stats.txt are giving me 'nan', and even with
>>>> the debug flags on, I can't exactly measure this change. Any help would
>>>> be appreciated!
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> John Smith
>>>> ___
>>>> gem5-users mailing list -- gem5-users@gem5.org
>>>> To unsubscribe send an email to gem5-users-le...@gem5.org
>>>>
>>>
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org


[gem5-users] Regarding the recvAtomic() function in mem_ctrl.cc

2023-07-06 Thread John Smith via gem5-users
Hi everyone,
I have a doubt regarding the operation of the recvAtomic() function in the
memory controller. I can see that recvAtomic() calls recvAtomicLogic(),
which returns the access latency from the memory interface. If I change the
code to:
return mem_intr->accessLatency() + 100;

Does this mean that it will take 100 more ticks for the memory controller
to access the memory? If yes, then how can I visualize this change? The
AMAT stats in stats.txt are giving me 'nan', and even with the debug flags
on, I can't exactly measure this change. Any help would be appreciated!


-- 
Regards,
John Smith
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org


[gem5-users] Re: Adding a delay of certain ticks in gem5

2023-07-06 Thread John Smith via gem5-users
Okay, understood. Thanks.

On Thu, Jul 6, 2023 at 12:57 PM Jason Lowe-Power 
wrote:

> Hi John,
>
> The following may be helpful:
>
>
> https://gem5bootcamp.github.io/gem5-bootcamp-env/modules/developing%20gem5%20models/events/
> https://www.youtube.com/watch?v=OcXA1D4b1RA=3868s
>
> Cheers,
> Jason
>
> On Thu, Jul 6, 2023 at 9:53 AM Eliot Moss via gem5-users <
> gem5-users@gem5.org> wrote:
>
>> On 7/6/2023 11:12 AM, John Smith via gem5-users wrote:
>> > Greetings,
>> > If I want to, for example, add a delay of 100 ticks before a line of
>> > code executes in the function handleTimingReqMiss() in cache.cc, how do
>> > I go about doing that?
>>
>> Generally speaking, you'll have to schedule an event and then do the
>> rest of the work in the event handler - something like that.  You can't
>> just suspend code in the middle.  You'll probably need to break things
>> into two functions to accomplish this.
>>
>> EM
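
A minimal sketch of the pattern Eliot describes, for a gem5 SimObject (the
class name, the handler name, and the fixed 100-tick delay are placeholders;
a real change to cache.cc would also have to carry the packet and any other
state over to the event handler):

class DelayedCache : public ClockedObject
{
  private:
    // Callback that runs when the scheduled tick is reached; it holds the
    // second half of the work that used to run inline.
    EventFunctionWrapper delayedMissEvent;

    void doDelayedMiss()
    {
        // ... the rest of the original handleTimingReqMiss() logic ...
    }

  public:
    DelayedCache(const ClockedObjectParams &p)
        : ClockedObject(p),
          delayedMissEvent([this]{ doDelayedMiss(); }, name())
    {}

    void handleTimingReqMiss(/* PacketPtr pkt, ... */)
    {
        // ... first half of the original function ...

        // Instead of continuing inline, resume 100 ticks from now.
        schedule(delayedMissEvent, curTick() + 100);
    }
};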
>> ___
>> gem5-users mailing list -- gem5-users@gem5.org
>> To unsubscribe send an email to gem5-users-le...@gem5.org
>>
>
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org


[gem5-users] Re: Adding a delay of certain ticks in gem5

2023-07-06 Thread John Smith via gem5-users
I've looked into the schedule() function, which is used to schedule events.
But can this function be used to simulate delays?

On Thu, Jul 6, 2023 at 11:12 AM John Smith 
wrote:

> Greetings,
> If I want to, for example, add a delay of 100 ticks before a line of code
> executes in the function handleTimingReqMiss() in cache.cc, how do I go
> about doing that?
>
> --
> Regards,
> John Smith
>
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org


[gem5-users] Adding a delay of certain ticks in gem5

2023-07-06 Thread John Smith via gem5-users
Greetings,
If I want to, for example, add a delay of 100 ticks before a line of code
executes in the function handleTimingReqMiss() in cache.cc, how do I go
about doing that?

-- 
Regards,
John Smith
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org


[gem5-users] Simulating an additional cache after the LLC in gem5

2023-07-05 Thread John Smith via gem5-users
I want to simulate a cache that intercepts the addresses going from the LLC
to the memory controller and uses those addresses to update certain
information in that cache. Could anyone help me with how I could go about
doing this?

Regards,
Vincent
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org


[gem5-users] Browsing the gem5 codebase

2023-07-05 Thread John Smith via gem5-users
Is there a way to make browsing the gem5 codebase easier and to use
features like 'Go to Definition'?

Thanks,
John
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org


[gem5-users] Re: Running parallel version of a CPU benchmark on multiple cores

2021-04-19 Thread John Smith via gem5-users
Hi,

I am running gem5 with the GPU model, and I am running a CPU benchmark and
a GPU benchmark simultaneously.

For the GPU: 2D Convolution from polybench-gpu.
I have 1+1 CPUs to handle the GPU's thread launches. Apparently the ROCm
runtime launches an extra thread, so an extra CPU is needed for it (credit:
Matt Sinclair).
For the CPU: parsec.raytrace (with pthreads and m5threads linked) with 2
threads, so I have 2+1 CPUs for the CPU benchmark.
My apu_se.py is set up as follows:
...
pid_cnt = 100
cpu_list[0].workload = Process(executable=executable,
                               cmd=[options.cmd] + options.options.split(),
                               drivers=[gpu_driver], env=env, pid=pid_cnt)
cpu_list[0].createThreads()

cpu_list[1].workload = cpu_list[0].workload
cpu_list[1].createThreads()

pid_cnt = 101
process = Process(executable=options.cpu_bench_bin,
                  cmd=[options.cpu_bench_bin] + options.cpu_benchmark_args.split(),
                  env=env, pid=pid_cnt)

cpu_list[2].workload = process
cpu_list[2].createThreads()
cpu_list[3].workload = process
cpu_list[3].createThreads()
cpu_list[4].workload = process
cpu_list[4].createThreads()
...

But when I look at the generated stats, I can see that CPU instructions are
committed (non-zero), but the GPU instructions committed are 0.
When running 2D Convolution by itself (1+1 CPUs) and a trivial CPU benchmark
with no threads, I can see non-zero GPU instructions committed.
What could be the reason for this?

Thank You,
John Smith
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org

[gem5-users] Re: Running parallel version of a CPU benchmark on multiple cores

2021-04-16 Thread John Smith via gem5-users
Does that mean I don't have to use m5threads and can just use the regular
pthread library?

On Fri, Apr 16, 2021 at 2:13 PM Jason Lowe-Power 
wrote:

> Hi John,
>
> Yeah, it's something like that. We usually suggest using N + 1 cores where
> N is the number of threads. You can always use more ;).
>
> As a side note, if you configure things correctly (whatever that means...)
> I believe you can get pthreads to work. You can link to the pthreads on the
> host and I think gem5 can correctly execute that code.
>
> Cheers,
> Jason
>
> On Fri, Apr 16, 2021 at 10:42 AM John Smith  wrote:
>
>> That sounds great. In the meantime I will work a bit more on the SE mode.
>> Also do you have any inputs on the following ?
>>
>> m5threads: If there are 9 CPU, and the host CPU launches 9 threads, then
>> are 8 threads launched on the remaining 8 CPUs and the 9th thread has to
>> wait for a
>> thread to complete to begin execution. If not then where does it run as
>> all the 9 CPUs are currently running a thread (1 host + 8 threads).
>>
>> Thank you,
>> John Smith
>>
>> On Fri, Apr 16, 2021 at 12:58 PM Jason Lowe-Power 
>> wrote:
>>
>>> Soon! https://gem5.atlassian.net/browse/GEM5-195
>>>
>>> We're hopeful that in the next month or so all of this code will be
>>> public.
>>>
>>> Cheers,
>>> Jason
>>>
>>> On Fri, Apr 16, 2021 at 9:55 AM John Smith 
>>> wrote:
>>>
>>>> Will I also be able to run the GPU model in the FS mode ?
>>>>
>>>> On Fri, Apr 16, 2021 at 11:39 AM Jason Lowe-Power 
>>>> wrote:
>>>>
>>>>> Hi John,
>>>>>
>>>>> I suggest using full system mode instead of SE mode if you're running
>>>>> a multithreaded workload. In FS mode, there's a full OS so it can handle
>>>>> thread switching, etc. For Parsec on x86 we've created a set of resources
>>>>> for you to get started. See
>>>>> https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/parsec/
>>>>> for details.
>>>>>
>>>>> Cheers,
>>>>> Jason
>>>>>
>>>>> On Fri, Apr 16, 2021 at 8:07 AM John Smith via gem5-users <
>>>>> gem5-users@gem5.org> wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> I am sorry for the confusion.
>>>>>>
>>>>>> I am looking to run a multithreaded application on a mesh of 3x3
>>>>>> CPUs, where the benchmark spawns 9 threads and each thread runs on
>>>>>> a single CPU (1:1). I went through the past discussions on this
>>>>>> mailing list and saw that m5threads was needed to do this. I have some
>>>>>> questions.
>>>>>>
>>>>>> (1) If there are 9 CPU, and the host CPU launches 9 threads, then are
>>>>>> 8 threads launched on the remaining 8 CPUs and the 9th thread has to wait
>>>>>> for a
>>>>>> thread to complete to begin execution. If not then where does it run
>>>>>> as all the 9 CPUs are currently running a thread (1 host + 8 threads).
>>>>>>
>>>>>> (2) Anthony Gutierrez said that m5threads is no longer needed. Is
>>>>>> that correct for gem5-21 ?
>>>>>>  (Subject: Simulating multiprogrammed & multithreaded workloads
>>>>>> in SE mode?)
>>>>>>
>>>>>> (3) Right now I am trying to build PARSEC 3.0 benchmarks with
>>>>>> m5threads, but I am receiving some errors as follows and I am not sure 
>>>>>> why:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> *base_dir/local/gcc/bin/gcc -O3 -g -funroll-loops
>>>>>> -fprefetch-loop-arrays  base_dir/gem5dev/parsec-3.0/pkgs/pthread.o
>>>>>> -static-libgcc -Wl,--hash-style=both -Wl,--as-needed
>>>>>> -DPARSEC_VERSION=3.0-beta-20150206 -o siman_tsp siman_tsp.o  -L
>>>>>> base_dir/local/gcc/lib64 -L base_dir/local/gcc/lib ./.libs/libgslsiman.a
>>>>>> ../rng/.libs/libgslrng.a ../ieee-utils/.libs/libgslieeeutils.a
>>>>>> ../err/.libs/libgslerr.a ../sys/.libs/libgslsys.a 
>>>>>> ../utils/.libs/libutils.a
>>>>>> -lpthread -lmbase_dir/gem5dev/parsec-3.0/pkgs/pthread.o: In function
>>>>>> `__pthrea

[gem5-users] Re: Running parallel version of a CPU benchmark on multiple cores

2021-04-16 Thread John Smith via gem5-users
That sounds great. In the meantime I will work a bit more on the SE mode.
Also, do you have any input on the following?

m5threads: If there are 9 CPUs and the host CPU launches 9 threads, are 8
threads launched on the remaining 8 CPUs while the 9th thread has to wait
for a thread to complete before it can begin execution? If not, where does
it run, as all 9 CPUs are currently running a thread (1 host + 8 threads)?

Thank you,
John Smith

On Fri, Apr 16, 2021 at 12:58 PM Jason Lowe-Power 
wrote:

> Soon! https://gem5.atlassian.net/browse/GEM5-195
>
> We're hopeful that in the next month or so all of this code will be public.
>
> Cheers,
> Jason
>
> On Fri, Apr 16, 2021 at 9:55 AM John Smith  wrote:
>
>> Will I also be able to run the GPU model in the FS mode ?
>>
>> On Fri, Apr 16, 2021 at 11:39 AM Jason Lowe-Power 
>> wrote:
>>
>>> Hi John,
>>>
>>> I suggest using full system mode instead of SE mode if you're running a
>>> multithreaded workload. In FS mode, there's a full OS so it can handle
>>> thread switching, etc. For Parsec on x86 we've created a set of resources
>>> for you to get started. See
>>> https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/parsec/
>>> for details.
>>>
>>> Cheers,
>>> Jason
>>>
>>> On Fri, Apr 16, 2021 at 8:07 AM John Smith via gem5-users <
>>> gem5-users@gem5.org> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I am sorry for the confusion.
>>>>
>>>> I am looking to run a multithreaded application on a mesh of 3x3 CPUs,
>>>> where the benchmark spawns 9 threads and each thread runs on
>>>> a single CPU (1:1). I went through the past discussions on this mailing
>>>> list and saw that m5threads was needed to do this. I have some questions.
>>>>
>>>> (1) If there are 9 CPU, and the host CPU launches 9 threads, then are 8
>>>> threads launched on the remaining 8 CPUs and the 9th thread has to wait for
>>>> a
>>>> thread to complete to begin execution. If not then where does it run as
>>>> all the 9 CPUs are currently running a thread (1 host + 8 threads).
>>>>
>>>> (2) Anthony Gutierrez said that m5threads is no longer needed. Is that
>>>> correct for gem5-21 ?
>>>>  (Subject: Simulating multiprogrammed & multithreaded workloads in
>>>> SE mode?)
>>>>
>>>> (3) Right now I am trying to build PARSEC 3.0 benchmarks with
>>>> m5threads, but I am receiving some errors as follows and I am not sure why:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *base_dir/local/gcc/bin/gcc -O3 -g -funroll-loops
>>>> -fprefetch-loop-arrays  base_dir/gem5dev/parsec-3.0/pkgs/pthread.o
>>>> -static-libgcc -Wl,--hash-style=both -Wl,--as-needed
>>>> -DPARSEC_VERSION=3.0-beta-20150206 -o siman_tsp siman_tsp.o  -L
>>>> base_dir/local/gcc/lib64 -L base_dir/local/gcc/lib ./.libs/libgslsiman.a
>>>> ../rng/.libs/libgslrng.a ../ieee-utils/.libs/libgslieeeutils.a
>>>> ../err/.libs/libgslerr.a ../sys/.libs/libgslsys.a ../utils/.libs/libutils.a
>>>> -lpthread -lmbase_dir/gem5dev/parsec-3.0/pkgs/pthread.o: In function
>>>> `__pthread_initialize_minimal':pthread.c:(.text+0x97): undefined reference
>>>> to `_dl_phdr'pthread.c:(.text+0xd9): undefined reference to `_dl_phnum'*
>>>>
>>>> Generally how should I go about integrating the m5thread with any
>>>> benchmark?
>>>>
>>>> (4) Also, what other CPU benchmarks are recommended which are
>>>> multithreaded and can be run in a manner where I can
>>>> launch a thread on each CPU ?
>>>>
>>>> Thank You,
>>>> John Smith
>>>>
>>>>
>>>> On Fri, Apr 16, 2021 at 1:50 AM Gabe Black via gem5-users <
>>>> gem5-users@gem5.org> wrote:
>>>>
>>>>> That's essentially right, although gem5 does have some plumbing to run
>>>>> multiple event queues within the same simulation which can coordinate with
>>>>> each other within a small window (quantum) of time. gem5 has support for
>>>>> fibers/threads/coroutines, but these are not typically used to model
>>>>> events. Events are processed inline when they happen using a simple
>>>>> function call.
>&

[gem5-users] Re: Running parallel version of a CPU benchmark on multiple cores

2021-04-16 Thread John Smith via gem5-users
Will I also be able to run the GPU model in FS mode?

On Fri, Apr 16, 2021 at 11:39 AM Jason Lowe-Power 
wrote:

> Hi John,
>
> I suggest using full system mode instead of SE mode if you're running a
> multithreaded workload. In FS mode, there's a full OS so it can handle
> thread switching, etc. For Parsec on x86 we've created a set of resources
> for you to get started. See
> https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/parsec/
> for details.
>
> Cheers,
> Jason
>
> On Fri, Apr 16, 2021 at 8:07 AM John Smith via gem5-users <
> gem5-users@gem5.org> wrote:
>
>> Hi All,
>>
>> I am sorry for the confusion.
>>
>> I am looking to run a multithreaded application on a mesh of 3x3 CPUs,
>> where the benchmark spawns 9 threads and each thread runs on
>> a single CPU (1:1). I went through the past discussions on this mailing
>> list and saw that m5threads was needed to do this. I have some questions.
>>
>> (1) If there are 9 CPU, and the host CPU launches 9 threads, then are 8
>> threads launched on the remaining 8 CPUs and the 9th thread has to wait for
>> a
>> thread to complete to begin execution. If not then where does it run as
>> all the 9 CPUs are currently running a thread (1 host + 8 threads).
>>
>> (2) Anthony Gutierrez said that m5threads is no longer needed. Is that
>> correct for gem5-21 ?
>>  (Subject: Simulating multiprogrammed & multithreaded workloads in SE
>> mode?)
>>
>> (3) Right now I am trying to build PARSEC 3.0 benchmarks with m5threads,
>> but I am receiving some errors as follows and I am not sure why:
>>
>>
>>
>>
>>
>> *base_dir/local/gcc/bin/gcc -O3 -g -funroll-loops -fprefetch-loop-arrays
>>  base_dir/gem5dev/parsec-3.0/pkgs/pthread.o -static-libgcc
>> -Wl,--hash-style=both -Wl,--as-needed -DPARSEC_VERSION=3.0-beta-20150206 -o
>> siman_tsp siman_tsp.o  -L base_dir/local/gcc/lib64 -L
>> base_dir/local/gcc/lib ./.libs/libgslsiman.a ../rng/.libs/libgslrng.a
>> ../ieee-utils/.libs/libgslieeeutils.a ../err/.libs/libgslerr.a
>> ../sys/.libs/libgslsys.a ../utils/.libs/libutils.a -lpthread
>> -lmbase_dir/gem5dev/parsec-3.0/pkgs/pthread.o: In function
>> `__pthread_initialize_minimal':pthread.c:(.text+0x97): undefined reference
>> to `_dl_phdr'pthread.c:(.text+0xd9): undefined reference to `_dl_phnum'*
>>
>> Generally how should I go about integrating the m5thread with any
>> benchmark?
>>
>> (4) Also, what other CPU benchmarks are recommended which are
>> multithreaded and can be run in a manner where I can
>> launch a thread on each CPU ?
>>
>> Thank You,
>> John Smith
>>
>>
>> On Fri, Apr 16, 2021 at 1:50 AM Gabe Black via gem5-users <
>> gem5-users@gem5.org> wrote:
>>
>>> That's essentially right, although gem5 does have some plumbing to run
>>> multiple event queues within the same simulation which can coordinate with
>>> each other within a small window (quantum) of time. gem5 has support for
>>> fibers/threads/coroutines, but these are not typically used to model
>>> events. Events are processed inline when they happen using a simple
>>> function call.
>>>
>>> Gabe
>>>
>>> On Thu, Apr 15, 2021 at 2:46 AM gabriel.busnot--- via gem5-users <
>>> gem5-users@gem5.org> wrote:
>>>
>>>> Hi John,
>>>>
>>>> Short answer : no, you can only run several simulations in parallel,
>>>> but not a single simulation using one thread per CPU.
>>>>
>>>> Gem5 relies on Discrete Event Simulation (DES) to simulate the
>>>> concurrent behavior of HW.
>>>> DES is intrinsically sequential in its execution as it relies on
>>>> coroutines (also called user threads, green threads, fibers, etc.).
>>>> Parallelizing such an application is a very hard task that often requires
>>>> a lot of subtle code transformations to efficiently protect shared
>>>> resources.
>>>> If done correctly, then parallel DES does not have all the good
>>>> properties of classic DES, especially determinism... Unless you add extra
>>>> care to preserve it, which is hard, too. Trust me ;).
>>>>
>>>> This question has been discussed back in the days but seems stalled
>>>> now: http://www.m5sim.org/Parallel_M5
>>>>
>>>> Cheers,
>>>> Gabriel
>>>> ___

[gem5-users] Re: Running parallel version of a CPU benchmark on multiple cores

2021-04-16 Thread John Smith via gem5-users
Hi All,

I am sorry for the confusion.

I am looking to run a multithreaded application on a mesh of 3x3 CPUs,
where the benchmark spawns 9 threads and each thread runs on
a single CPU (1:1). I went through the past discussions on this mailing
list and saw that m5threads was needed to do this. I have some questions.

(1) If there are 9 CPUs and the host CPU launches 9 threads, are 8 threads
launched on the remaining 8 CPUs while the 9th thread has to wait for a
thread to complete before it can begin execution? If not, where does it
run, as all 9 CPUs are currently running a thread (1 host + 8 threads)?

(2) Anthony Gutierrez said that m5threads is no longer needed. Is that
correct for gem5-21?
 (Subject: Simulating multiprogrammed & multithreaded workloads in SE
mode?)

(3) Right now I am trying to build PARSEC 3.0 benchmarks with m5threads,
but I am receiving some errors as follows and I am not sure why:





*base_dir/local/gcc/bin/gcc -O3 -g -funroll-loops -fprefetch-loop-arrays
 base_dir/gem5dev/parsec-3.0/pkgs/pthread.o -static-libgcc
-Wl,--hash-style=both -Wl,--as-needed -DPARSEC_VERSION=3.0-beta-20150206 -o
siman_tsp siman_tsp.o  -L base_dir/local/gcc/lib64 -L
base_dir/local/gcc/lib ./.libs/libgslsiman.a ../rng/.libs/libgslrng.a
../ieee-utils/.libs/libgslieeeutils.a ../err/.libs/libgslerr.a
../sys/.libs/libgslsys.a ../utils/.libs/libutils.a -lpthread
-lmbase_dir/gem5dev/parsec-3.0/pkgs/pthread.o: In function
`__pthread_initialize_minimal':pthread.c:(.text+0x97): undefined reference
to `_dl_phdr'pthread.c:(.text+0xd9): undefined reference to `_dl_phnum'*

Generally, how should I go about integrating m5threads with any
benchmark?

(4) Also, what other multithreaded CPU benchmarks are recommended that can
be run in a manner where I can launch a thread on each CPU?

Thank You,
John Smith


On Fri, Apr 16, 2021 at 1:50 AM Gabe Black via gem5-users <
gem5-users@gem5.org> wrote:

> That's essentially right, although gem5 does have some plumbing to run
> multiple event queues within the same simulation which can coordinate with
> each other within a small window (quantum) of time. gem5 has support for
> fibers/threads/coroutines, but these are not typically used to model
> events. Events are processed inline when they happen using a simple
> function call.
>
> Gabe
>
> On Thu, Apr 15, 2021 at 2:46 AM gabriel.busnot--- via gem5-users <
> gem5-users@gem5.org> wrote:
>
>> Hi John,
>>
>> Short answer : no, you can only run several simulations in parallel, but
>> not a single simulation using one thread per CPU.
>>
>> Gem5 relies on Discrete Event Simulation (DES) to simulate the concurrent
>> behavior of HW.
>> DES is intrinsically sequential in its execution as it relies on
>> coroutines (also called user threads, green threads, fibers, etc.).
>> Parallelizing such an application is a very hard task that often requires a
>> lot of subtle code transformations to efficiently protect shared resources.
>> If done correctly, then parallel DES does not have all the good
>> properties of classic DES, especially determinism... Unless you add extra
>> care to preserve it, which is hard, too. Trust me ;).
>>
>> This question has been discussed back in the days but seems stalled now:
>> http://www.m5sim.org/Parallel_M5
>>
>> Cheers,
>> Gabriel
>> ___
>> gem5-users mailing list -- gem5-users@gem5.org
>> To unsubscribe send an email to gem5-users-le...@gem5.org
>>
> ___
> gem5-users mailing list -- gem5-users@gem5.org
> To unsubscribe send an email to gem5-users-le...@gem5.org
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org

[gem5-users] Running parallel version of a CPU benchmark on multiple cores

2021-04-14 Thread John Smith via gem5-users
 Hello Everyone,

Is it possible to run a parallel version of a CPU benchmark like SPEC on
multiple cores?
For example, if I have a mesh of 9 CPU cores, can I launch 9 threads of the
CPU benchmark, 1 thread on each core simultaneously?

Thank you for the information in advance.
John Smith
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org