Re: [m5-dev] Memory corruption in m5 dev repository when using --trace-flags="ExecEnable"

2009-04-03 Thread Gabe Black
I tried that command line and I haven't seen any segfault yet. I'll let
it run and see if anything happens. What version of the code are you using?
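
A quick way to report that, assuming the tree is a plain Mercurial clone of
the m5 repository, is to paste the changeset being built from:

% hg id
% hg log -l 1 --template "{node|short} {date|isodate} {desc|firstline}\n"

hg id prints the working-copy changeset hash (with a "+" appended if there
are uncommitted local changes), and the hg log line shows the changeset the
build is based on.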

Gabe

Geoffrey Blake wrote:
>
> I’ve added a couple of edits, but nothing major, i.e., added statistics to
> the bus model, and some extra latency randomization to cache misses to
> get better averages of parallel code runs.  None of this is tied to
> the trace-flags mechanism, as far as I can determine.
>
>  
>
> I did run the code through valgrind but, oddly enough, the
> segfault disappears under it. I’ll keep digging in my spare time.
>
>  
>
> The “Exec” trace flags work fine (billions of instructions, no
> problems) with an old version of m5 that is somewhere between beta4
> and beta5 of the stable releases. Now I can trace maybe a few thousand
> instructions before M5 segfaults.
>
>  
>
> Here is a stripped command line that exposes the bug with the
> fewest variables to consider, in case someone out there wants
> to try to duplicate the segfaults I’m seeing (it could be a product
> of my build setup, so I’d appreciate it if someone could verify
> independently):
>
> % m5.opt -trace-flags="ExecEnable" fs.py -b MutexTest -t -n 1 > /dev/null
>
>  
>
> Geoff
>
>  
>
> From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
> Behalf Of Korey Sewell
> Sent: Friday, April 03, 2009 9:56 AM
> To: M5 Developer List
> Subject: Re: [m5-dev] Memory corruption in m5 dev repository when
> using --trace-flags="ExecEnable"
>
>  
>
> I would echo Gabe's sentiments. I've been suspicious of the trace-flags
> causing memory corruption for a while now, but every time I dig into it,
> there's some small error that I'm propagating through that finally
> surfaces.
>
> In the big picture, I suspect that the trace-flags just exacerbate any
> kind of memory-corruption issue, since you are accessing things at
> such a heavy rate.
>
> In terms of debugging, is there any code that you edited that is
> tagged when you use "ExecEnable" rather than just "Exec"?
>
> Also, if you can turn valgrind on for maybe the first thousand or million
> cycles with ExecEnable, you'll probably find something.
>
> On Thu, Apr 2, 2009 at 7:28 PM, Gabriel Michael Black
> <gbl...@eecs.umich.edu> wrote:
>
> Does this happen when you start tracing sooner? I'd suggest valgrind,
> especially if you can make the segfault happen quickly. If you wait
> for your simulation to get to 14000 ticks in valgrind, you may
> die before you see the result. There's a suppression file in util
> which should cut down on the noise.
>
> Gabe
>
>
> Quoting Geoffrey Blake <bla...@umich.edu>:
>
> > I stumbled upon what appears to be a memory corruption bug in the
> > current M5 repository.  If on the command line I enter:
> >
> > % ./build/ALPHA_FS/m5.opt -trace-flags="ExecEnable"
> > -trace-start=14000 fs.py -b <benchmark> -t -n <num CPUs> <other parameters>
> >
> > the simulator will error with a segmentation fault or occasionally an
> > assert not long after starting to trace instructions.
> >
> >
> >
> > I have run this through gdb with m5.debug and see the same errors;
> > the problem is that the stack trace showing the cause of the segfault
> > or assert changes depending on the inputs to the simulator. So I have
> > not been able to pinpoint this bug, which appears to be a subtle memory
> > corruption somewhere in the code. This error does not happen for other
> > trace flags such as the "Cache" trace flag. It appears linked solely to
> > the instruction tracing mechanism.  Has anybody else seen this bug?
> >
> >
> >
> > I'm using an up-to-date repository I pulled from m5sim.org this morning.
> >
> >
> >
> > Thanks,
> > Geoff
> >
> >
>
>
>
>
>
> -- 
> --
> Korey L Sewell
> Graduate Student - PhD Candidate
> Computer Science & Engineering
> University of Michigan
>
>
> 
>

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Memory corruption in m5 dev repository when using --trace-flags="ExecEnable"

2009-04-03 Thread Geoffrey Blake
I've added a couple of edits, but nothing major, i.e., added statistics to the
bus model, and some extra latency randomization to cache misses to get
better averages of parallel code runs.  None of this is tied to the
trace-flags mechanism, as far as I can determine.

 

I did run the code through valgrind but, oddly enough, the segfault
disappears under it. I'll keep digging in my spare time.

 

The "Exec" trace flags work fine (billions of instructions, no problems)
with an old version of m5 that is somewhere between beta4 and beta5 of the
stable releases. Now I can trace maybe a few thousand instructions before M5
seg faults.

 

Here is a stripped command line that exposes the bug with the fewest
variables to consider, in case someone out there wants to try to
duplicate the segfaults I'm seeing (it could be a product of my build setup,
so I'd appreciate it if someone could verify independently):

% m5.opt -trace-flags="ExecEnable" fs.py -b MutexTest -t -n 1 > /dev/null
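
For anyone trying to reproduce this, here is a minimal sketch of capturing a
stack trace with gdb over the same stripped command (assuming the ALPHA_FS
build directory used above, and that m5.debug was built alongside m5.opt; it
accepts the same command line):

% gdb --args ./build/ALPHA_FS/m5.debug -trace-flags="ExecEnable" \
      fs.py -b MutexTest -t -n 1
(gdb) run > /dev/null
(gdb) bt full

When the segfault or assert fires, "bt full" dumps the stack along with
local variables; comparing a few of those traces across runs may show which
object is being trampled even though the faulting site moves around.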

 

Geoff

 

From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf
Of Korey Sewell
Sent: Friday, April 03, 2009 9:56 AM
To: M5 Developer List
Subject: Re: [m5-dev] Memory corruption in m5 dev repository when using
--trace-flags="ExecEnable"

 

I would echo Gabe's sentiments. I've been suspicious of the trace-flags
causing memory corruption for a while now, but every time I dig into it,
there's some small error that I'm propagating through that finally surfaces.

In the big picture, I suspect that the trace-flags just exacerbate any kind
of memory-corruption issue, since you are accessing things at such a
heavy rate.

In terms of debugging, is there any code that you edited that is tagged when
you use "ExecEnable" rather than just "Exec"?

Also, if you can turn valgrind on for maybe the first thousand or million
cycles with ExecEnable, you'll probably find something.

On Thu, Apr 2, 2009 at 7:28 PM, Gabriel Michael Black wrote:

Does this happen when you start tracing sooner? I'd suggest valgrind,
especially if you can make the segfault happen quickly. If you wait
for your simulation to get to 14000 ticks in valgrind, you may
die before you see the result. There's a suppression file in util
which should cut down on the noise.

Gabe


Quoting Geoffrey Blake :

> I stumbled upon what appears to be a memory corruption bug in the current
> M5 repository.  If on the command line I enter:
>
> % ./build/ALPHA_FS/m5.opt -trace-flags="ExecEnable"
> -trace-start=14000 fs.py -b <benchmark> -t -n <num CPUs> <other parameters>
>
> the simulator will error with a segmentation fault or occasionally an
> assert not long after starting to trace instructions.
>
>
>
> I have run this through gdb with m5.debug and see the same errors; the
> problem is that the stack trace showing the cause of the segfault or assert
> changes depending on the inputs to the simulator. So I have not been able
> to pinpoint this bug, which appears to be a subtle memory corruption
> somewhere in the code. This error does not happen for other trace flags
> such as the "Cache" trace flag. It appears linked solely to the instruction
> tracing mechanism.  Has anybody else seen this bug?
>
>
>
> I'm using an up-to-date repository I pulled from m5sim.org this morning.
>
>
>
> Thanks,
> Geoff
>
>







-- 
--
Korey L Sewell
Graduate Student - PhD Candidate
Computer Science & Engineering
University of Michigan


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Memory corruption in m5 dev repository when using --trace-flags="ExecEnable"

2009-04-03 Thread Korey Sewell
I would echo Gabe's sentiments. I've been suspicious of the trace-flags
causing memory corruption for a while now, but every time I dig into it,
there's some small error that I'm propagating through that finally surfaces.

In the big picture, I suspect that the trace-flags just exacerbate any kind
of memory-corruption issue, since you are accessing things at such a
heavy rate.

In terms of debugging, is there any code that you edited that is tagged when
you use "ExecEnable" rather than just "Exec"?

Also, if you can turn valgrind on for maybe the first thousand or million
cycles with ExecEnable, you'll probably find something.
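
For the first question, grepping the tree is the quickest check, and for the
second, valgrind can be pointed straight at Geoff's stripped command line.
A minimal sketch, assuming the usual src/ layout, the ALPHA_FS build and
MutexTest workload from that command, and that the suppression file Gabe
mentioned is util/valgrind-suppressions (the filename is a guess; check
util/ for the actual name):

% grep -rn ExecEnable src/
% valgrind --suppressions=util/valgrind-suppressions \
      ./build/ALPHA_FS/m5.opt -trace-flags="ExecEnable" \
      fs.py -b MutexTest -t -n 1 > /dev/null

Leaving out -trace-start keeps the traced region at the very start of the
run, so if the corruption surfaces within the first thousand or million
cycles the valgrind run stays short enough to be bearable.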

On Thu, Apr 2, 2009 at 7:28 PM, Gabriel Michael Black  wrote:

> Does this happen when you start tracing sooner? I'd suggest valgrind,
> especially if you can make the segfault happen quickly. If you wait
> for your simulation to get to 14000 ticks in valgrind, you may
> die before you see the result. There's a suppression file in util
> which should cut down on the noise.
>
> Gabe
>
> Quoting Geoffrey Blake :
>
> > I stumbled upon what appears to be a memory corruption bug in the current
> > M5 repository.  If on the command line I enter:
> >
> > % ./build/ALPHA_FS/m5.opt -trace-flags="ExecEnable"
> > -trace-start=14000 fs.py -b <benchmark> -t -n <num CPUs> <other parameters>
> >
> > the simulator will error with a segmentation fault or occasionally an
> > assert not long after starting to trace instructions.
> >
> >
> >
> > I have run this through gdb with m5.debug and see the same errors; the
> > problem is that the stack trace showing the cause of the segfault or assert
> > changes depending on the inputs to the simulator. So I have not been able
> > to pinpoint this bug, which appears to be a subtle memory corruption
> > somewhere in the code. This error does not happen for other trace flags
> > such as the "Cache" trace flag. It appears linked solely to the instruction
> > tracing mechanism.  Has anybody else seen this bug?
> >
> >
> >
> > I'm using an up-to-date repository I pulled from m5sim.org this morning.
> >
> >
> >
> > Thanks,
> > Geoff
> >
> >
>
>
>



-- 
--
Korey L Sewell
Graduate Student - PhD Candidate
Computer Science & Engineering
University of Michigan
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


[m5-dev] Cron /z/m5/regression/do-regression quick

2009-04-03 Thread Cron Daemon
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/o3-timing passed.
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-atomic passed.
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-timing passed.
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/o3-timing passed.
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-atomic passed.
* build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-timing passed.
* build/ALPHA_SE/tests/fast/quick/01.hello-2T-smt/alpha/linux/o3-timing passed.
* build/ALPHA_SE/tests/fast/quick/20.eio-short/alpha/eio/simple-atomic passed.
* build/ALPHA_SE/tests/fast/quick/20.eio-short/alpha/eio/simple-timing passed.
* build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple-atomic-mp passed.
* build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple-timing-mp passed.
* build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest passed.
* build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-atomic passed.
* build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-atomic-dual passed.
* build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-timing passed.
* build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-timing-dual passed.
* build/ALPHA_FS/tests/fast/quick/80.netperf-stream/alpha/linux/twosys-tsunami-simple-atomic passed.
* build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-atomic passed.
* build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-timing passed.
* build/SPARC_SE/tests/fast/quick/00.hello/sparc/linux/simple-atomic passed.
* build/SPARC_SE/tests/fast/quick/00.hello/sparc/linux/simple-timing passed.
* build/SPARC_SE/tests/fast/quick/02.insttest/sparc/linux/simple-atomic passed.
* build/SPARC_SE/tests/fast/quick/02.insttest/sparc/linux/o3-timing passed.
* build/SPARC_SE/tests/fast/quick/02.insttest/sparc/linux/simple-timing passed.
* build/X86_SE/tests/fast/quick/00.hello/x86/linux/simple-atomic passed.
* build/X86_SE/tests/fast/quick/00.hello/x86/linux/simple-timing passed.

See /z/m5/regression/regress-2009-04-03-03:00:01 for details.

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Memory corruption in m5 dev repository when using --trace-flags="ExecEnable"

2009-04-03 Thread nathan binkert
> I have run this through gdb with m5.debug and see the same errors; the
> problem is that the stack trace showing the cause of the segfault or assert
> changes depending on the inputs to the simulator. So I have not been able
> to pinpoint this bug, which appears to be a subtle memory corruption
> somewhere in the code. This error does not happen for other trace flags such
> as the “Cache” trace flag. It appears linked solely to the instruction
> tracing mechanism.  Has anybody else seen this bug?

Regressions have been passing and I haven't heard anything about this,
so we'll need more info.  (Or better yet, I'd like to give you moral
support to continue debugging :)

  Nate
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev