Re: [gem5-users] problem with benchmarks which uses '<'

Nilay Vaish Sat, 28 Apr 2012 09:10:37 -0700

Check why the size of the memory being asked for is so high? It might bethat there is some problem with the way 'length' is found in mmapFunc().Or it could be that the application did request a huge chunk of memory tobe mmapped.


--
Nilay


On Sat, 28 Apr 2012, Mahmood Naderan wrote:

Isn't there any idea about my finding?


On 4/27/12, Mahmood Naderan <[email protected]> wrote:

ok I think I find the bug. I used "continue" and "ctrl+c" multiple
times to see if it stuck at a particular function. The backtrace
shows:


#0  0x00000000004dfee7 in __gnu_cxx::hashtable<std::pair<unsigned long
const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
long>, std::_Select1st<std::pair<unsigned long const,
X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
std::allocator<X86ISA::TlbEntry> >::_M_bkt_num_key (this=0x28b8970,
    __key=@0x21d9a7c8, __n=50331653) at
/usr/include/c++/4.4/backward/hashtable.h:590
#1  0x00000000004dfff9 in __gnu_cxx::hashtable<std::pair<unsigned long
const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
long>, std::_Select1st<std::pair<unsigned long const,
X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
std::allocator<X86ISA::TlbEntry> >::_M_bkt_num (this=0x28b8970,
    __obj=..., __n=50331653) at
/usr/include/c++/4.4/backward/hashtable.h:594
#2  0x00000000004df9c8 in __gnu_cxx::hashtable<std::pair<unsigned long
const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
long>, std::_Select1st<std::pair<unsigned long const,
X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
std::allocator<X86ISA::TlbEntry> >::resize (this=0x28b8970,
    __num_elements_hint=25165844) at
/usr/include/c++/4.4/backward/hashtable.h:1001
#3  0x00000000004df100 in __gnu_cxx::hashtable<std::pair<unsigned long
const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned
long>, std::_Select1st<std::pair<unsigned long const,
X86ISA::TlbEntry> >, std::equal_to<unsigned long>,
std::allocator<X86ISA::TlbEntry> >::find_or_insert (this=0x28b8970,
    __obj=...) at /usr/include/c++/4.4/backward/hashtable.h:789
#4  0x00000000004deaca in __gnu_cxx::hash_map<unsigned long,
X86ISA::TlbEntry, __gnu_cxx::hash<unsigned long>,
std::equal_to<unsigned long>, std::allocator<X86ISA::TlbEntry>

::operator[] (this=0x28b8970,

    __key=@0x7fffffffba80) at /usr/include/c++/4.4/ext/hash_map:217
#5  0x00000000004daa68 in PageTable::map (this=0x28b8970,
vaddr=47015569313792, paddr=103079288832,
    size=5548434767986339840, clobber=false) at
build/X86/mem/page_table.cc:82
#6  0x000000000074b9c8 in Process::allocateMem (this=0x30be640,
vaddr=46912496128000,
    size=5548434871059525632, clobber=false) at
build/X86/sim/process.cc:332
#7  0x00000000007aba21 in mmapFunc<X86Linux64> (desc=0x2052fb8, num=9,
p=0x30be640, tc=0x3331210)
    at build/X86/sim/syscall_emul.hh:1069
#8  0x000000000073ca11 in SyscallDesc::doSyscall (this=0x2052fb8,
callnum=9, process=0x30be640,
    tc=0x3331210) at build/X86/sim/syscall_emul.cc:69
#9  0x00000000007516a0 in LiveProcess::syscall (this=0x30be640,
callnum=9, tc=0x3331210)
    at build/X86/sim/process.cc:590
#10 0x0000000000c10ce3 in SimpleThread::syscall (this=0x33305d0, callnum=9)
    at build/X86/cpu/simple_thread.hh:384



As you can see there is a problem with mmapFunc<X86Linux64> syscall
which allocate memory through Process::allocateMem
That is my understanding....



On 4/27/12, Mahmood Naderan <[email protected]> wrote:

Is this useful?

339051500: system.cpu + A0 T0 : 0x83d48d.4  :   CALL_NEAR_I : wrip   ,
t7, t1 : IntAlu :
339052000: system.cpu.icache: ReadReq (ifetch) 452f90 hit
339052000: system.cpu + A0 T0 : 0x852f90    : mov       r10, rcx
339052000: system.cpu + A0 T0 : 0x852f90.0  :   MOV_R_R : mov   r10,
r10, rcx : IntAlu :  D=0x0000000000000022
339052500: system.cpu.icache: ReadReq (ifetch) 452f90 hit
339052500: system.cpu + A0 T0 : 0x852f93    : mov       eax, 0x9
339052500: system.cpu + A0 T0 : 0x852f93.0  :   MOV_R_I : limm   eax,
0x9 : IntAlu :  D=0x0000000000000009
339053000: system.cpu.icache: ReadReq (ifetch) 452f98 hit
^C
Program received signal SIGINT, Interrupt.
0x00000000004e0f90 in
std::__fill_n_a<__gnu_cxx::_Hashtable_node<std::pair<unsigned long
const, X86ISA::TlbEntry> >**, unsigned long,
__gnu_cxx::_Hashtable_node<std::pair<unsigned long const,
X86ISA::TlbEntry> >*> (__first=0x7fff70017000, __n=4065295,
__value=@0x7fffffffb8d0)
    at /usr/include/c++/4.4/bits/stl_algobase.h:758
758             *__first = __tmp;
(gdb) ^CQuit
(gdb)



On 4/27/12, Steve Reinhardt <[email protected]> wrote:

Perhaps you could fire off the run under gdb, and use the --debug-break
flag to drop in to gdb at the tick where it seems to stop running.  If
the
simulation stops and memory blows up, it's almost like you're stuck in
some
subtle infinite loop with a memory allocation in it.  (You might have to
continue just a little past there and hit ctrl-c before it dies to catch
it
in the middle of this loop.)

On Fri, Apr 27, 2012 at 11:29 AM, Mahmood Naderan
<[email protected]>wrote:

i searched for something similar (stoping the simulation when it reach
at a specific memory usage to prevent killing) but didn't find such
thing. Do you know?

I also attached gdb. it doesn't show anything useful because lastly it
get killed.

On 4/27/12, Gabe Black <[email protected]> wrote:

Valgrind should tell you where the leaked memory was allocated. You
may
have to give it a command line option for that, or stop it before it
gets itself killed.

Gabe

On 04/27/12 11:10, Steve Reinhardt wrote:

Can you attach gdb when it does this, see where it's at, and maybe
step through the code a bit to see what it's doing?

On Fri, Apr 27, 2012 at 10:54 AM, Mahmood Naderan
<[email protected] <mailto:[email protected]>> wrote:

    That was a guess. As I said, i turned on the debugger to see
when
it
    start eating the memory. As you can see the last messageit print
is:
    339069000: system.cpu + A0 T0 : 0x852f93.0  :   MOV_R_I : limm

eax,

    0x9 : IntAlu :  D=0x0000000000000009
    339069500: system.cpu.icache: set be: moving blk 452f80 to MRU
    339069500: system.cpu.icache: ReadReq (ifetch) 452f98 hit

    Then no message is printed and I see, with top command, that the
    memory usage gos up and up until it consumes all memory.


    On 4/27/12, Nilay Vaish <[email protected]
    <mailto:[email protected]>> wrote:
   > How do you know the instruction at which the memory starts
    leaking? What
   > should we conclude from the instruction trace in your mail. I
am
    unable to
   > arrive at any conclusion from the valgrind report that you had
    attached.
   > Apart from the info on uninitialized values, I did not find
any
    useful
   > output produced by valgrind.
   >
   > --
   > Nilay
   >
   > On Fri, 27 Apr 2012, Mahmood Naderan wrote:
   >
   >> tonto with the test input uses about 4 GB and runs for about
2
    seconds
   >> on a real machine.
   >>
   >> I also used the test input with gem5. However again after
tick
   >> 300000000, all the 30GB memory is used and then gem5 is
killed.
The
   >> same behaviour with ref input...
   >>
   >> I ran the following command:
   >> valgrind --tool=memcheck --leak-check=full
--track-origins=yes
   >> --suppressions=../util/valgrind-suppressions

../build/X86/m5.debug

   >> --debug-flags=Cache,ExecAll,Bus,CacheRepl,Context
   >> --trace-start=339050000 ../configs/example/se.py -c
   >> tonto_base.amd64-m64-gcc44-nn --cpu-type=detailed -F 5000000
    --maxtick
   >> 10000000 --caches --l2cache --prog-interval=100000
   >>
   >>
   >> I also attach the report again. At the instruction that the

memory

   >> leak begins, you can see:
   >> ...
   >> 339066000: system.cpu + A0 T0 : 0x83d48d    : call   0x15afe
   >> 339066000: system.cpu + A0 T0 : 0x83d48d.0  :   CALL_NEAR_I :

limm

   >> t1, 0x15afe : IntAlu :  D=0x0000000000015afe
   >> 339066500: system.cpu + A0 T0 : 0x83d48d.1  :   CALL_NEAR_I :

rdip

   >> t7, %ctrl153,  : IntAlu :  D=0x000000000083d492
   >> 339067000: system.cpu.dcache: set 9a: moving blk 5aa680 to
MRU
   >> 339067000: system.cpu.dcache: WriteReq 5aa6b8 hit
   >> 339067000: system.cpu + A0 T0 : 0x83d48d.2  :   CALL_NEAR_I :
    st   t7,
   >> SS:[rsp + 0xfffffffffffffff8] : MemWrite :
D=0x000000000083d492
   >> A=0x7fffffffe6b8
   >> 339067500: system.cpu + A0 T0 : 0x83d48d.3  :   CALL_NEAR_I :

subi

   >> rsp, rsp, 0x8 : IntAlu :  D=0x00007fffffffe6b8
   >> 339068000: system.cpu + A0 T0 : 0x83d48d.4  :   CALL_NEAR_I :
    wrip   ,
   >> t7, t1 : IntAlu :
   >> 339068500: system.cpu.icache: set be: moving blk 452f80 to
MRU
   >> 339068500: system.cpu.icache: ReadReq (ifetch) 452f90 hit
   >> 339068500: system.cpu + A0 T0 : 0x852f90    : mov    r10, rcx
   >> 339068500: system.cpu + A0 T0 : 0x852f90.0  :   MOV_R_R : mov
    r10,
   >> r10, rcx : IntAlu :  D=0x0000000000000022
   >> 339069000: system.cpu.icache: set be: moving blk 452f80 to
MRU
   >> 339069000: system.cpu.icache: ReadReq (ifetch) 452f90 hit
   >> 339069000: system.cpu + A0 T0 : 0x852f93    : mov    eax, 0x9
   >> 339069000: system.cpu + A0 T0 : 0x852f93.0  :   MOV_R_I :
limm
      eax,
   >> 0x9 : IntAlu :  D=0x0000000000000009
   >> 339069500: system.cpu.icache: set be: moving blk 452f80 to
MRU
   >> 339069500: system.cpu.icache: ReadReq (ifetch) 452f98 hit
   >>
   >>
   >> What is your opinion then?
   >> Regards,
   >>
   >> On 4/27/12, Steve Reinhardt <[email protected]
    <mailto:[email protected]>> wrote:
   >>> Also, if you do run valgrind, use the
    util/valgrind-suppressions file to
   >>> suppress spurious reports.  Read the valgrind docs to see
how
this
   >>> works.
   >>>
   >>> Steve
   >>>
   > _______________________________________________
   > gem5-users mailing list
   > [email protected] <mailto:[email protected]>
   > http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
   >


    --
    // Naderan *Mahmood;
    _______________________________________________
    gem5-users mailing list
    [email protected] <mailto:[email protected]>
    http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users



_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users



--
// Naderan *Mahmood;
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users



--
// Naderan *Mahmood;



--
// Naderan *Mahmood;



--
// Naderan *Mahmood;
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

Re: [gem5-users] problem with benchmarks which uses '<'

Reply via email to