[gem5-users] Question on how to create a command line options for a switch (router) in garnet2.0

2019-08-14 Thread Kimiya Fasahar
Greetings to all.
I am trying to create a pyramidal NoC topology with a base layer of 64
routers, a middle layer of 16 routers, and an apex layer of 4 routers. I
used the command-line option --num-cpus=64 to get the base layer. I found
that I cannot specify all 84 routers at once and then split them into the
required layers afterwards.
1. How can I use the command-line options to specify the remaining routers
in the other layers?
2. If I want to use routing switches as in the simple network, how do I
pass that on the command line?
3. Any other guidance on how to create the above topology is welcome.
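
For reference, the layer arithmetic above can be sketched in plain Python.
This is only an illustration, not gem5 code: it assumes each layer is a
square grid (8x8, 4x4, 2x2) and that every upper-layer router serves a 2x2
block of routers in the layer below. Neither assumption comes from the
description above, and the helper names are hypothetical.

```python
# Hypothetical helpers (not part of gem5) that number the routers of a
# pyramid layer by layer and list the child -> parent links between layers.

def pyramid_layers(base_side=8, levels=3):
    """Return ([(first_router_id, side), ...], total_routers), base first."""
    layers = []
    next_id = 0
    side = base_side
    for _ in range(levels):
        layers.append((next_id, side))
        next_id += side * side   # a layer holds side x side routers
        side //= 2               # assumption: each layer halves the side
    return layers, next_id


def vertical_links(layers):
    """Connect every router to the router above its 2x2 block (assumed)."""
    links = []
    for (lo_start, lo_side), (hi_start, hi_side) in zip(layers, layers[1:]):
        for row in range(lo_side):
            for col in range(lo_side):
                child = lo_start + row * lo_side + col
                parent = hi_start + (row // 2) * hi_side + (col // 2)
                links.append((child, parent))
    return links


layers, total = pyramid_layers()
links = vertical_links(layers)
print(layers)      # [(0, 8), (64, 4), (80, 2)]
print(total)       # 84
print(len(links))  # 80
```

Under these assumptions, routers 0-63 form the base, 64-79 the middle layer,
and 80-83 the apex, which matches the 64 + 16 + 4 = 84 routers described
above.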
Thanks and best regards.
Kimiya F
___
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

Re: [gem5-users] CPUID function 0_7 - CacheParams

2019-08-14 Thread Pouya Fotouhi
Great, thanks for the explanations and info!

On Wed, Aug 14, 2019 at 4:32 PM Gabe Black wrote:

> We might have some support for xsave but I wasn't able to find it. We do
> decode it, but I'm pretty sure we return a WarnUnimplemented instruction. I
> was confusing XSAVE with FXSAVE before, where FXSAVE is part of SSE and
> which we do at least partially support. I don't think we have support for
> XSAVE which is another thing which saves a bunch of processor state as
> selected with a mask and I think an XCR0 register (which we also don't
> support) with variable sizes, etc, etc. Someone could add support for that,
> but it sounds like a lot of work.
>
> Gabe
>
> On Wed, Aug 14, 2019 at 3:55 PM Pouya Fotouhi 
> wrote:
>
>> I'm still digesting some of your points, but in general, something I
>> noticed in "newer" kernels is the "hard-coded" assumption for some of these
>> (specially security related) features (take SMAP as an example). So, to my
>> understanding, if our CPUID simply says "I don't know", in some cases
>> kernel interprets that as a yes rather than a no! So, again to my limited
>> knowledge, I think it'd best to respond negative until we have support for
>> these features.
>>
>> Regarding xsave, if you recall discussions we had about change 19892, our CPUID
>> returns 0x04000209 for 0_1. The most significant set bit we have is bit 26,
>> which tells the kernel we do have support for xsave and then kernel tries
>> to set bit 18 on CR4. Correct me if I'm wrong, but my understanding was
>> that we have "some" support for xsave in gem5. Although looking at my
>> kernel logs, kernel seem to disable it after some tests during SMP boot
>> process (probably our support is not enough for kernel and it masks it off).
>>
>> Best,
>>
>> On Wed, Aug 14, 2019 at 3:28 PM Gabe Black  wrote:
>>
>>> Actually it looks like somebody added a new function while skipping over
>>> the ones below. That's how the unimplemented functions slipped through. I'm
>>> not going to try to implement those for now, but I don't want to discourage
>>> anyone that wants to do something with them.
>>>
>>> I'm also looking into why the kernel thinks we support xsave (which
>>> seems to be fairly complicated) when we do not. I think there's just an
>>> extra bit set in CPUID I need to turn off.
>>>
>>> Gabe
>>>
>>> On Wed, Aug 14, 2019 at 3:01 PM Gabe Black  wrote:
>>>
>>>> I was actually just looking this since I noticed that one of the x86
>>>> kernels I have lying around was crashing with an undefined opcode
>>>> exception. I see that the doCpuid function will just bail out for some of
>>>> the functions which are below the largest it supports (so it can support
>>>> the extended functions). The CPUID instruction will just leave EAX, EBX,
>>>> ECX and EDX unmodified in this case since it isn't supposed to raise any
>>>> type of fault. The kernel will try to interpret those fields as an actual
>>>> answer since we told it those functions were supported, and depending on
>>>> what executed before it do something arbitrary. We should definitely stop
>>>> doing that for starters. I think this is something I partially implemented
>>>> since it was blocking boot a long time ago, and then never went back and
>>>> filled out. For some of these functions we may not have good answers, for
>>>> instance where reporting cache sizes. I'm not sure what to do in that case.
>>>> We may need to look at those fields one by one and try to come up with
>>>> safe, fairly inert answers. If we can return something that says "I don't
>>>> know", that would be best.
>>>>
>>>> The specific case I'm looking at is function 0xd though, which we would
>>>> have told the kernel we don't support. That's also passing through its
>>>> values which is also giving bad answers.
>>>>
>>>> I'll put up some CLs which fill out function constants we don't yet
>>>> have, return 0 when we don't get an answer from doCpuid, and start looking
>>>> at what the unimplemented functions should return. We can build on that to
>>>> add in functions that are missing so the kernel at least stops tripping
>>>> over itself when it gets nonsensical answers from CPUID.
>>>>
>>>> Gabe
>>>>
>>>> On Wed, Aug 14, 2019 at 2:01 PM Pouya Fotouhi
>>>> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> During kernel boot up with the timing/atomic/O3 CPU modes I get the
>>>>> following kernel oops at native_flush_tlb_global. Looking closer at the
>>>>> issue, Exec traces show:
>>>>>
>>>>> 2014093750: system.cpu A0 T0 : @native_flush_tlb_global+96: mov
>>>>>   eax, 0x2
>>>>> 2014093750: system.cpu A0 T0 : @native_flush_tlb_global+96.0  :
>>>>> MOV_R_I : limm   eax, 0x2 : IntAlu :  D=0x0002
>>>>>  flags=(IsInteger|IsMicroop|IsLastMicroop|IsFirstMicroop)
>>>>> 2014094250: system.cpu A0 T0 : @native_flush_tlb_global+101: ud2
>>>>> 2014094250: system.cpu A0 T0 : @native_flush_tlb_global+101.0  :   UD2
>>>>> : fault

Re: [gem5-users] CPUID function 0_7 - CacheParams

2019-08-14 Thread Pouya Fotouhi
I'm still digesting some of your points, but in general, something I
noticed in "newer" kernels is a "hard-coded" assumption about some of these
(especially security-related) features (take SMAP as an example). So, to my
understanding, if our CPUID simply says "I don't know", in some cases the
kernel interprets that as a yes rather than a no! So, again to my limited
knowledge, I think it'd be best to respond negatively until we have support
for these features.

Regarding xsave, if you recall the discussions we had about change 19892,
our CPUID returns 0x04000209 for 0_1. The most significant set bit we have
is bit 26, which tells the kernel we do have support for xsave, and then the
kernel tries to set bit 18 in CR4. Correct me if I'm wrong, but my
understanding was that we have "some" support for xsave in gem5. Although,
looking at my kernel logs, the kernel seems to disable it after some tests
during the SMP boot process (probably our support is not enough for the
kernel, and it masks it off).
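
As a quick check of the bit arithmetic, here is a small Python sketch that
lists the set bits in 0x04000209. It assumes the value is the ECX output of
CPUID function 1 (the register is not named above); the bit names follow the
Intel SDM feature table, and the helper names are made up for illustration.

```python
# A few CPUID function 1 ECX feature bits, per the Intel SDM.
LEAF1_ECX_BITS = {
    0: "SSE3",
    3: "MONITOR",
    9: "SSSE3",
    26: "XSAVE",
    27: "OSXSAVE",
}


def set_bits(value):
    """Return the positions of all set bits in a 32-bit value, low to high."""
    return [bit for bit in range(32) if (value >> bit) & 1]


def describe(value, names):
    """Map each set bit to a feature name, or 'bitN' if it is not listed."""
    return [names.get(bit, f"bit{bit}") for bit in set_bits(value)]


value = 0x04000209  # value quoted above; assumed to be ECX of function 1
print(set_bits(value))                  # [0, 3, 9, 26]
print(describe(value, LEAF1_ECX_BITS))  # ['SSE3', 'MONITOR', 'SSSE3', 'XSAVE']
```

Bit 26 is indeed the highest set bit, and it is the XSAVE feature flag; CR4
bit 18 is OSXSAVE, which is the bit the kernel then tries to set.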

Best,

On Wed, Aug 14, 2019 at 3:28 PM Gabe Black wrote:

> Actually it looks like somebody added a new function while skipping over
> the ones below. That's how the unimplemented functions slipped through. I'm
> not going to try to implement those for now, but I don't want to discourage
> anyone that wants to do something with them.
>
> I'm also looking into why the kernel thinks we support xsave (which seems
> to be fairly complicated) when we do not. I think there's just an extra bit
> set in CPUID I need to turn off.
>
> Gabe
>
> On Wed, Aug 14, 2019 at 3:01 PM Gabe Black  wrote:
>
>> I was actually just looking this since I noticed that one of the x86
>> kernels I have lying around was crashing with an undefined opcode
>> exception. I see that the doCpuid function will just bail out for some of
>> the functions which are below the largest it supports (so it can support
>> the extended functions). The CPUID instruction will just leave EAX, EBX,
>> ECX and EDX unmodified in this case since it isn't supposed to raise any
>> type of fault. The kernel will try to interpret those fields as an actual
>> answer since we told it those functions were supported, and depending on
>> what executed before it do something arbitrary. We should definitely stop
>> doing that for starters. I think this is something I partially implemented
>> since it was blocking boot a long time ago, and then never went back and
>> filled out. For some of these functions we may not have good answers, for
>> instance where reporting cache sizes. I'm not sure what to do in that case.
>> We may need to look at those fields one by one and try to come up with
>> safe, fairly inert answers. If we can return something that says "I don't
>> know", that would be best.
>>
>> The specific case I'm looking at is function 0xd though, which we would
>> have told the kernel we don't support. That's also passing through its
>> values which is also giving bad answers.
>>
>> I'll put up some CLs which fill out function constants we don't yet have,
>> return 0 when we don't get an answer from doCpuid, and start looking at
>> what the unimplemented functions should return. We can build on that to add
>> in functions that are missing so the kernel at least stops tripping over
>> itself when it gets nonsensical answers from CPUID.
>>
>> Gabe
>>
>> On Wed, Aug 14, 2019 at 2:01 PM Pouya Fotouhi 
>> wrote:
>>
>>> Hi All,
>>>
>>> During kernel boot up with the timing/atomic/O3 CPU modes I get the
>>> following kernel oops at native_flush_tlb_global. Looking closer at the
>>> issue, Exec traces show:
>>>
>>> 2014093750: system.cpu A0 T0 : @native_flush_tlb_global+96: mov
>>> eax, 0x2
>>> 2014093750: system.cpu A0 T0 : @native_flush_tlb_global+96.0  :
>>> MOV_R_I : limm   eax, 0x2 : IntAlu :  D=0x0002
>>>  flags=(IsInteger|IsMicroop|IsLastMicroop|IsFirstMicroop)
>>> 2014094250: system.cpu A0 T0 : @native_flush_tlb_global+101: ud2
>>> 2014094250: system.cpu A0 T0 : @native_flush_tlb_global+101.0  :   UD2 :
>>> fault   Invalid-Opcode : No_OpClass :
>>> flags=(IsMicroop|IsLastMicroop|IsFirstMicroop)
>>> 2014094500: system.cpu A0 T0 : @native_flush_tlb_global+101.32768 :
>>> Microcode_ROM : slli   t4, t1, 0x4 : IntAlu :  D=0x0060
>>>  flags=(IsInteger|IsMicroop|IsDelayedCommit)
>>>
>>> Looking at  the decode of the "undefined" instruction raising the fault:
>>> 2014094250: system.cpu: Decode: Decoded fault instruction:
>>> {
>>> leg = 0x10,
>>> rex = 0,
>>> vex/xop = 0,
>>> op = {
>>> type = three byte 0f38,
>>> op = 0x82,
>>> },
>>> modRM = 0,
>>> sib = 0,
>>> immediate = 0,
>>> displacement = 0
>>> dispSize = 0}
>>>
>>> Which apparently is  invpcid, and dump of native_flush_tlb_global
>>> confirms:
>>>
>>>0x81033a68 <+96>:mov$0x2,%eax
>>>0x81033a6d <+101>:   

Re: [gem5-users] CPUID function 0_7 - CacheParams

2019-08-14 Thread Gabe Black
Actually it looks like somebody added a new function while skipping over
the ones below. That's how the unimplemented functions slipped through. I'm
not going to try to implement those for now, but I don't want to discourage
anyone that wants to do something with them.

I'm also looking into why the kernel thinks we support xsave (which seems
to be fairly complicated) when we do not. I think there's just an extra bit
set in CPUID I need to turn off.
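
To illustrate what turning off one bit looks like, here is a minimal Python
sketch. It is not the gem5 change itself: it assumes the bit in question is
bit 26 (XSAVE) of the function 1 value 0x04000209 quoted elsewhere in this
thread, and the helper name is hypothetical.

```python
XSAVE_BIT = 26  # CPUID function 1, ECX bit 26 (XSAVE) per the Intel SDM


def clear_feature(reg, bit):
    """Return a 32-bit register value with one feature bit cleared."""
    return reg & ~(1 << bit) & 0xFFFFFFFF


reported = 0x04000209  # value quoted in this thread (assumed to be ECX)
fixed = clear_feature(reported, XSAVE_BIT)
print(hex(fixed))  # 0x209
```

The remaining bits (0, 3 and 9) are left untouched, so only the XSAVE
advertisement changes.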

Gabe

On Wed, Aug 14, 2019 at 3:01 PM Gabe Black wrote:

> I was actually just looking this since I noticed that one of the x86
> kernels I have lying around was crashing with an undefined opcode
> exception. I see that the doCpuid function will just bail out for some of
> the functions which are below the largest it supports (so it can support
> the extended functions). The CPUID instruction will just leave EAX, EBX,
> ECX and EDX unmodified in this case since it isn't supposed to raise any
> type of fault. The kernel will try to interpret those fields as an actual
> answer since we told it those functions were supported, and depending on
> what executed before it do something arbitrary. We should definitely stop
> doing that for starters. I think this is something I partially implemented
> since it was blocking boot a long time ago, and then never went back and
> filled out. For some of these functions we may not have good answers, for
> instance where reporting cache sizes. I'm not sure what to do in that case.
> We may need to look at those fields one by one and try to come up with
> safe, fairly inert answers. If we can return something that says "I don't
> know", that would be best.
>
> The specific case I'm looking at is function 0xd though, which we would
> have told the kernel we don't support. That's also passing through its
> values which is also giving bad answers.
>
> I'll put up some CLs which fill out function constants we don't yet have,
> return 0 when we don't get an answer from doCpuid, and start looking at
> what the unimplemented functions should return. We can build on that to add
> in functions that are missing so the kernel at least stops tripping over
> itself when it gets nonsensical answers from CPUID.
>
> Gabe
>
> On Wed, Aug 14, 2019 at 2:01 PM Pouya Fotouhi 
> wrote:
>
>> Hi All,
>>
>> During kernel boot up with the timing/atomic/O3 CPU modes I get the
>> following kernel oops at native_flush_tlb_global. Looking closer at the
>> issue, Exec traces show:
>>
>> 2014093750: system.cpu A0 T0 : @native_flush_tlb_global+96: mov
>> eax, 0x2
>> 2014093750: system.cpu A0 T0 : @native_flush_tlb_global+96.0  :   MOV_R_I
>> : limm   eax, 0x2 : IntAlu :  D=0x0002
>>  flags=(IsInteger|IsMicroop|IsLastMicroop|IsFirstMicroop)
>> 2014094250: system.cpu A0 T0 : @native_flush_tlb_global+101: ud2
>> 2014094250: system.cpu A0 T0 : @native_flush_tlb_global+101.0  :   UD2 :
>> fault   Invalid-Opcode : No_OpClass :
>> flags=(IsMicroop|IsLastMicroop|IsFirstMicroop)
>> 2014094500: system.cpu A0 T0 : @native_flush_tlb_global+101.32768 :
>> Microcode_ROM : slli   t4, t1, 0x4 : IntAlu :  D=0x0060
>>  flags=(IsInteger|IsMicroop|IsDelayedCommit)
>>
>> Looking at  the decode of the "undefined" instruction raising the fault:
>> 2014094250: system.cpu: Decode: Decoded fault instruction:
>> {
>> leg = 0x10,
>> rex = 0,
>> vex/xop = 0,
>> op = {
>> type = three byte 0f38,
>> op = 0x82,
>> },
>> modRM = 0,
>> sib = 0,
>> immediate = 0,
>> displacement = 0
>> dispSize = 0}
>>
>> Which apparently is  invpcid, and dump of native_flush_tlb_global
>> confirms:
>>
>>0x81033a68 <+96>:mov$0x2,%eax
>>0x81033a6d <+101>:   invpcid (%rcx),%rax
>>0x81033a72 <+106>:   add$0x18,%rsp
>>
>> We do not implement this instruction, and It seems like this
>> functionality is reported in function 0_7 of CPUID (which we do not
>> implement).
>>
>> I also have a different, yet related, issue with SMAP and FSGSBASE bits
>> (bits 20 and 16 in CR4), where kernel tries to set those resulting in a
>> fault which our CPUs can't handle and kernel panics upon them. These
>> functionalities are also reported by function 0_7 in CPUID which we do not
>> implement
>>
>> I was wondering if it would be safe to simply return 0s for function 0_7?
>> I checked, and I couldn't find anything violating the functionalities we
>> support in gem5. However, I would appreciate if someone more familiar with
>> our support for x86 can double check
>> https://www.sandpile.org/x86/cpuid.htm#level__0007h and verify that
>> returning 0s would be fine here.
>>
>> For the corner case my kernel was hitting, I tested and returning 0s
>> would get me past both these issues. Upon confirmation from someone in the
>> community, I can proceed and submit the change.
>>
>> Best,
>> --
>> Pouya Fotouhi
>> PhD Candidate
>> 

[gem5-users] CPUID function 0_7 - CacheParams

2019-08-14 Thread Pouya Fotouhi
Hi All,

During kernel boot with the timing/atomic/O3 CPU modes, I get the
following kernel oops at native_flush_tlb_global. Looking closer at the
issue, the Exec traces show:

2014093750: system.cpu A0 T0 : @native_flush_tlb_global+96: mov eax, 0x2
2014093750: system.cpu A0 T0 : @native_flush_tlb_global+96.0  :   MOV_R_I : limm   eax, 0x2 : IntAlu :  D=0x0002  flags=(IsInteger|IsMicroop|IsLastMicroop|IsFirstMicroop)
2014094250: system.cpu A0 T0 : @native_flush_tlb_global+101: ud2
2014094250: system.cpu A0 T0 : @native_flush_tlb_global+101.0  :   UD2 : fault   Invalid-Opcode : No_OpClass : flags=(IsMicroop|IsLastMicroop|IsFirstMicroop)
2014094500: system.cpu A0 T0 : @native_flush_tlb_global+101.32768 : Microcode_ROM : slli   t4, t1, 0x4 : IntAlu :  D=0x0060  flags=(IsInteger|IsMicroop|IsDelayedCommit)

Looking at the decode of the "undefined" instruction raising the fault:
2014094250: system.cpu: Decode: Decoded fault instruction:
{
    leg = 0x10,
    rex = 0,
    vex/xop = 0,
    op = {
        type = three byte 0f38,
        op = 0x82,
    },
    modRM = 0,
    sib = 0,
    immediate = 0,
    displacement = 0
    dispSize = 0}

Which apparently is invpcid, and a dump of native_flush_tlb_global confirms:

   0x81033a68 <+96>:    mov    $0x2,%eax
   0x81033a6d <+101>:   invpcid (%rcx),%rax
   0x81033a72 <+106>:   add    $0x18,%rsp

We do not implement this instruction, and it seems that support for it is
reported in function 0_7 of CPUID (which we do not implement).

I also have a different, yet related, issue with the SMAP and FSGSBASE bits
(bits 20 and 16 in CR4): the kernel tries to set them, which results in a
fault that our CPUs can't handle, and the kernel panics. These features are
also reported by function 0_7 of CPUID, which we do not implement.

I was wondering if it would be safe to simply return 0s for function 0_7. I
checked, and I couldn't find anything that conflicts with the functionality
we support in gem5. However, I would appreciate it if someone more familiar
with our x86 support could double-check
https://www.sandpile.org/x86/cpuid.htm#level__0007h and verify that
returning 0s would be fine here.

For the corner case my kernel was hitting, I tested this, and returning 0s
gets me past both of these issues. Upon confirmation from someone in the
community, I can proceed and submit the change.
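
To make the effect of returning 0s concrete, here is a small Python sketch
of how the relevant function 0_7 feature bits would be read. It is only an
illustration: the bit positions are those the Intel SDM lists for
CPUID.(EAX=07H, ECX=0):EBX, and the helper name is hypothetical, not
something in gem5 or the kernel.

```python
# Selected CPUID function 7 (sub-leaf 0) EBX feature bits, per the Intel SDM.
LEAF7_EBX_BITS = {
    0: "FSGSBASE",
    7: "SMEP",
    10: "INVPCID",
    20: "SMAP",
}


def leaf7_features(ebx):
    """Return the names of the listed features that are advertised in EBX."""
    return [name for bit, name in sorted(LEAF7_EBX_BITS.items())
            if (ebx >> bit) & 1]


# With all-zero output, none of the features is advertised.
print(leaf7_features(0))                        # []
# For comparison, a value advertising INVPCID and SMAP.
print(leaf7_features((1 << 10) | (1 << 20)))    # ['INVPCID', 'SMAP']
```

With an all-zero EBX, a kernel that checks these flags should neither use
invpcid nor try to enable the corresponding CR4 bits, which is consistent
with the test result described above.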

Best,
-- 
Pouya Fotouhi
PhD Candidate
Department of Electrical and Computer Engineering
University of California, Davis