Add a "t" at the end of the tick number: --dist-sync-start=1000000000000t
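A minimal sketch of why the suffix matters, assuming (this is not confirmed anywhere in the thread) that a value without "t" is read as seconds and multiplied by the global tick frequency. The helper `to_ticks` is hypothetical and only illustrates the arithmetic; it is not gem5's parser.

```python
# Sketch: why a unit-less --dist-sync-start value does not fit in a tick.
# Assumption (not confirmed in this thread): a value without the "t" suffix
# is interpreted as seconds and multiplied by the global tick frequency.

TICKS_PER_SECOND = 1000000000000  # "Global frequency set at ..." from the log
MAX_UINT64 = 2**64 - 1            # largest value a 64-bit unsigned integer holds


def to_ticks(value):
    """Hypothetical helper: convert an option string to a tick count."""
    if value.endswith("t"):
        # Already a tick count: take the digits as-is.
        return int(value[:-1])
    # No suffix: treat the number as seconds (floating point) and scale it.
    return int(float(value) * TICKS_PER_SECOND)


without_suffix = to_ticks("1000000000000")
with_suffix = to_ticks("1000000000000t")

print(without_suffix, without_suffix > MAX_UINT64)
print(with_suffix, with_suffix > MAX_UINT64)
```

Under that assumption, the value computed without the suffix is the same number that appears in the "Invoked with:" line of the traceback, and it is far larger than a 64-bit integer can hold, while the value with "t" stays at 1000000000000 ticks.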
On Mon, Dec 11, 2017 at 12:40 PM, Vitorio Cargnini (lcargnini) wrote:
> Thanks Mohammad,
>
> I tried that and got the following in the log (log.switch), shown below. I
> also tried passing the parameters in different orders. What should I look
> for now?
>
> Log:
>
> gem5 Simulator System. http://gem5.org
> gem5 is copyrighted software; use the --copyright option for details.
>
> gem5 compiled Dec 6 2017 14:35:43
> gem5 started Dec 11 2017 11:16:10
> gem5 executing on rndarch11, pid 9841
> command line: /wada/gem5/build/ARM/gem5.opt -d /wada/gem5/m5out.switch --debug-flags=DistEthernet /wada/gem5/configs/dist/sw.py --dist-sync-start=1000000000000 --checkpoint-dir=/wada/gem5/m5out.switch --is-switch --dist-size=8 --dist-server-port=2200
>
> info: Standard input is not a terminal, disabling listeners.
> Global frequency set at 1000000000000 ticks per second
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
>   File "/wada/gem5/src/python/m5/main.py", line 433, in main
>     exec filecode in scope
>   File "/wada/gem5/configs/dist/sw.py", line 79, in <module>
>     main()
>   File "/wada/gem5/configs/dist/sw.py", line 76, in main
>     Simulation.run(options, root, None, None)
>   File "/wada/gem5/configs/common/Simulation.py", line 589, in run
>     m5.instantiate(checkpoint_dir)
>   File "/wada/gem5/src/python/m5/simulate.py", line 115, in instantiate
>     for obj in root.descendants(): obj.createCCObject()
>   File "/wada/gem5/src/python/m5/SimObject.py", line 1484, in createCCObject
>     self.getCCParams()
>   File "/wada/gem5/src/python/m5/SimObject.py", line 1439, in getCCParams
>     setattr(cc_params, param, value)
> TypeError: (): incompatible function arguments. The following argument types are supported:
>     1.
>        (self: _m5.param_DistEtherLink.DistEtherLinkParams, arg0: int) -> None
>
> Invoked with: <_m5.param_DistEtherLink.DistEtherLinkParams object at 0x7f9a37b8fd80>, 999999999999999983222784L
>
> *From:* gem5-users *On Behalf Of* Mohammad Alian
> *Sent:* Sunday, December 10, 2017 7:54 PM
> *To:* gem5 users mailing list
> *Subject:* Re: [gem5-users] [EXT] Re: Running Dist-gem5
>
> Oh, you should start synchronization between the gem5 nodes before
> communication starts inside the simulated cluster. Use the
> "--dist-sync-start" option to start synchronization before the send tick
> (4428354726000). You should pass this option to all gem5 processes (FS
> nodes + switch node), so set --dist-sync-start as a "--cf-args" argument
> in your launch script:
>
> --cf-args --dist-sync-start=1000000000000
>
> Best,
> Mohammad
>
> On Fri, Dec 8, 2017 at 12:36 PM, Vitorio Cargnini (lcargnini) wrote:
>
> Thanks Mohammad. I made some changes and tried again. It worked, but for
> some reason it simply … dies after a while; I am not sure why.
>
> I got the following message on my terminal:
>
> 0: global: DistIface::startup() done
> info: Entering event queue @ 0. Starting simulation...
> panic: panic condition recv_tick <= curTick() occurred: Simulators out of sync - missed packet receive by 771635016399 ticks(rev_recv_tick: 0 send_tick: 4428354726000 send_delay: 257601 linkDelay: 10000000 )
> Memory Usage: 402472 KBytes
> Program aborted at tick 5200000000000
>
> On log.switch this is what I got:
>
> **** REAL SIMULATION ****
> 0: system.portlink0: DistEtherLink::startup() called
> 0: global: DistIface::startup() started
> info: Dist sync scheduled at 5200000000000 and repeats 0: global: DistIface::startup() done
> 10000000
> 0: system.portlink1: DistEtherLink::startup() called
> 0: global: DistIface::startup() started
> 0: global: DistIface::startup() done
> 0: system.portlink2: DistEtherLink::startup() called
> 0: global: DistIface::startup() started
> 0: global: DistIface::startup() done
> 0: system.portlink3: DistEtherLink::startup() called
> 0: global: DistIface::startup() started
> 0: global: DistIface::startup() done
> 0: system.portlink4: DistEtherLink::startup() called
> 0: global: DistIface::startup() started
> 0: global: DistIface::startup() done
> 0: system.portlink5: DistEtherLink::startup() called
> 0: global: DistIface::startup() started
> 0: global: DistIface::startup() done
> 0: system.portlink6: DistEtherLink::startup() called
> 0: global: DistIface::startup() started
> 0: global: DistIface::startup() done
> 0: system.portlink7: DistEtherLink::startup() called
> 0: global: DistIface::startup() started
> 0: global: DistIface::startup() done
> info: Entering event queue @ 0. Starting simulation...
> panic: panic condition recv_tick <= curTick() occurred: Simulators out of sync - missed packet receive by 771635016399 ticks(rev_recv_tick: 0 send_tick: 4428354726000 send_delay: 257601 linkDelay: 10000000 )
> Memory Usage: 402472 KBytes
> Program aborted at tick 5200000000000
>
> *From:* gem5-users *On Behalf Of* Mohammad Alian
> *Sent:* Thursday, December 7, 2017 10:00 AM
> *To:* gem5 users mailing list
> *Subject:* Re: [gem5-users] [EXT] Re: Running Dist-gem5
>
> Please look at the content of log.*, not m5out.*/stats.txt. It's not
> surprising that stats.txt is empty ...
>
> On Thu, Dec 7, 2017 at 11:55 AM, Vitorio Cargnini (lcargnini) wrote:
>
> Hi,
>
> The m5out.*/stats.txt files from all nodes are empty.
>
> However, m5out.switch/config.ini is filled in. The portlink sections go
> from 0 to 7, for example:
>
> [system.portlink7]
> type=DistEtherLink
> delay=10000000
> delay_var=0
> dist_rank=0
> dist_size=8
> dist_sync_on_pseudo_op=false
> dump=Null
> eventq_index=0
> is_switch=true
> num_nodes=8
> server_name=127.0.0.1
> server_port=2200
> speed=800.000000
> sync_repeat=0
> sync_start=5200000000000
> int0=system.interface[7]
>
> I’m wondering whether the server_name could be the problem…
>
> *From:* gem5-users *On Behalf Of* Mohammad Alian
> *Sent:* Wednesday, December 6, 2017 4:28 PM
> *To:* gem5 users mailing list
> *Subject:* Re: [gem5-users] [EXT] Re: Running Dist-gem5
>
> Again, you need to look at log.* to find out why the simulation gets
> killed. Don't look only at log.switch. If one of the gem5 processes aborts,
> then the entire dist-gem5 simulation will be killed.
>
> On Wed, Dec 6, 2017 at 1:50 PM, Vitorio Cargnini (lcargnini) wrote:
>
> Hi Mohammad,
>
> Thank you for the prompt response.
> I checked log.switch. The first error I fixed was the path: the script
> needs full paths to work. Once I fixed that and tried again, it ran and
> failed a little later.
>
> Got the following output:
>
> launch switch gem5 process on node0 ...
> waiting for switch to start ..
> node #switch started
> START Wed Dec 6 12:36:04 MST 2017
> starting gem5 on node0...
> starting gem5 on node0...
> starting gem5 on node1...
> starting gem5 on node1...
> starting gem5 on node2 ...
> starting gem5 on node2 ...
> starting gem5 on node3 ...
> starting gem5 on node3 ...
> (I) (some) gem5 process(es) exited
> KILLED Wed Dec 6 12:37:35 MST 2017
> ABORT Wed Dec 6 12:37:35 MST 2017
>
> The log.switch had the following:
>
> command line: /wada/wada/gem5/build/ARM/gem5.opt -d /wada/wada/gem5/m5out.switch --debug-flags=DistEthernet /wada/wada/gem5/configs/dist/sw.py --checkpoint-dir=/wada/wada/gem5/m5out.switch --is-switch --dist-size=8 --dist-server-port=2200
>
> info: Standard input is not a terminal, disabling listeners.
> Global frequency set at 1000000000000 ticks per second
> 0: system.portlink0: DistEtherLink::DistEtherLink() link delay:10000000 ticksPerByte:800
> 0: global: DistIface() ctor rank:0
> info: tcp_iface listening on port 2200
> Killed by signal 15.
>
> *From:* gem5-users *On Behalf Of* Mohammad Alian
> *Sent:* Tuesday, December 5, 2017 9:18 PM
> *To:* gem5 users mailing list
> *Subject:* [EXT] Re: [gem5-users] Running Dist-gem5
>
> Hi Vitorio,
>
> You should check the content of log.switch to see why the gem5 node
> simulating the switch cannot start. There can be many reasons why a gem5
> process fails to run. If you print the content of switch.log here then I
> can help.
>
> Regarding "distributed run", you first need to set up passwordless ssh
> between your simulation (physical) hosts and then use the "LSB_MCPU_HOSTS"
> env variable to assign gem5 processes to physical hosts. E.g. if your
> simulated cluster size is 8 and you want to run 4 gem5 processes on
> host_name0 and 4 on host_name1, then your LSB_MCPU_HOSTS looks like this:
>
> export LSB_MCPU_HOSTS="host_name0 4 host_name1 4"
>
> Best,
> Mohammad
>
> On Tue, Dec 5, 2017 at 6:03 PM, Vitorio Cargnini (lcargnini) wrote:
>
> Hello,
>
> Please, what exactly do I need to run dist-gem5 with the --dist option?
>
> I’m trying, however it fails with “Failed to start switch”.
>
> Also, what do I need in place for it to start distributed across nodes,
> instead of launching multiple/parallel runs on ‘localhost’?
>
> Regards,
> Vitorio.
_______________________________________________ gem5-users mailing list [email protected] http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
