update:
While debugging the problem from my last message, I found that execution
never gets past the last statement of

void
Process::allocateMem(Addr vaddr, int64_t size, bool clobber)
{
    int npages = divCeil(size, (int64_t)VMPageSize);
    Addr paddr = system->allocPhysPages(npages);
    pTable->map(vaddr, paddr, size, clobber);
}

i.e. execution never returns from allocateMem().
The values are:
VMPageSize=4096
npages=31772235
paddr=0x5d2000
vaddr=0x2aaaaaaad000
size=5548434871059525632
clobber=false
I suspect the culprit is size=5548434871059525632.
'size' takes its value from 'length' in the caller, mmapFunc():
uint64_t length = p->getSyscallArg(tc, index);
At this point:
tc=0x3256ae0
index=2
Can someone explain what is going on? Why is 'length' so large?
On 4/28/12, Mahmood Naderan <[email protected]> wrote:
> SyscallVerbose shows this:
> 300000000: Event_36: system.cpu progress event, total
> committed:535958, progress insts committed: 89710, IPC: 0.8971
> 336777500: system.cpu: syscall time called w/arguments
> 15,0,0,140737488349664
> 336777500: system.cpu: syscall time returns 1000000000
> 336795000: system.cpu: syscall time called w/arguments 15,0,0,0
> 336795000: system.cpu: syscall time returns 1000000000
> 337615500: system.cpu: syscall open called w/arguments 2,438,0,9618232
> 337615500: global: opening file /etc/localtime
> 337615500: system.cpu: syscall open returns 5
> 337748000: system.cpu: syscall fstat called w/arguments
> 2,140737488349184,140737488349184,5
> 337748000: system.cpu: syscall fstat returns 0
> 337857500: system.cpu: syscall fstat called w/arguments
> 2,140737488348816,140737488348816,5
> 337857500: system.cpu: syscall fstat returns 0
> 337883500: system.cpu: syscall mmap called w/arguments 34,3,8192,0
> 337883500: system.cpu: syscall mmap returns 46912496119808
> 338041000: system.cpu: syscall read called w/arguments
> 34,8192,46912496119808,5
> 338041000: system.cpu: syscall read returns 1622
> 338303500: system.cpu: syscall lseek called w/arguments
> 100663296,1,18446744073709550604,5
> 338303500: system.cpu: syscall lseek returns 610
> 338483500: system.cpu: syscall read called w/arguments
> 100663296,8192,46912496119808,5
> 338483500: system.cpu: syscall read returns 1012
> 339055500: system.cpu: syscall mmap called w/arguments
> 34,3,5548434871059525632,0
> *** Exited on signal SIGKILL ***
>
>
> On 4/28/12, Steve Reinhardt <[email protected]> wrote:
>> You could try running with the SyscallVerbose debug flag and/or setting a
>> breakpoint in mmapFunc<X86Linux64> to see if/why the application seems to
>> be requesting such a ridiculously large amount of memory. It could be
>> something like trying to read a 64-bit value where only a 32-bit value is
>> being provided.
>>
>> Make sure you use gem5.debug for this if you're not already.
>>
>> Steve
>>
>> On Sat, Apr 28, 2012 at 9:10 AM, Nilay Vaish <[email protected]> wrote:
>>
>>> Check why the size of the memory being requested is so high. It might
>>> be that there is some problem with the way 'length' is obtained in
>>> mmapFunc(), or it could be that the application really did request a
>>> huge chunk of memory to be mmapped.
>>>
>>> --
>>> Nilay
>>>
>>>
>>> On Sat, 28 Apr 2012, Mahmood Naderan wrote:
>>>
>>>> Does anyone have any thoughts on my finding?
>>>>
>>>> On 4/27/12, Mahmood Naderan <[email protected]> wrote:
>>>>
>>>>> OK, I think I found the bug. I used "continue" and "ctrl+c" several
>>>>> times to see whether it was stuck in a particular function. The
>>>>> backtrace shows:
>>>>>
>>>>>
>>>>> #0  0x00000000004dfee7 in __gnu_cxx::hashtable<std::pair<unsigned long
>>>>> const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned long>,
>>>>> std::_Select1st<std::pair<unsigned long const, X86ISA::TlbEntry> >,
>>>>> std::equal_to<unsigned long>, std::allocator<X86ISA::TlbEntry>
>>>>> >::_M_bkt_num_key (this=0x28b8970, __key=@0x21d9a7c8, __n=50331653)
>>>>> at /usr/include/c++/4.4/backward/hashtable.h:590
>>>>> #1  0x00000000004dfff9 in __gnu_cxx::hashtable<std::pair<unsigned long
>>>>> const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned long>,
>>>>> std::_Select1st<std::pair<unsigned long const, X86ISA::TlbEntry> >,
>>>>> std::equal_to<unsigned long>, std::allocator<X86ISA::TlbEntry>
>>>>> >::_M_bkt_num (this=0x28b8970, __obj=..., __n=50331653)
>>>>> at /usr/include/c++/4.4/backward/hashtable.h:594
>>>>> #2  0x00000000004df9c8 in __gnu_cxx::hashtable<std::pair<unsigned long
>>>>> const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned long>,
>>>>> std::_Select1st<std::pair<unsigned long const, X86ISA::TlbEntry> >,
>>>>> std::equal_to<unsigned long>, std::allocator<X86ISA::TlbEntry>
>>>>> >::resize (this=0x28b8970, __num_elements_hint=25165844)
>>>>> at /usr/include/c++/4.4/backward/hashtable.h:1001
>>>>> #3  0x00000000004df100 in __gnu_cxx::hashtable<std::pair<unsigned long
>>>>> const, X86ISA::TlbEntry>, unsigned long, __gnu_cxx::hash<unsigned long>,
>>>>> std::_Select1st<std::pair<unsigned long const, X86ISA::TlbEntry> >,
>>>>> std::equal_to<unsigned long>, std::allocator<X86ISA::TlbEntry>
>>>>> >::find_or_insert (this=0x28b8970, __obj=...)
>>>>> at /usr/include/c++/4.4/backward/hashtable.h:789
>>>>> #4  0x00000000004deaca in __gnu_cxx::hash_map<unsigned long,
>>>>> X86ISA::TlbEntry, __gnu_cxx::hash<unsigned long>,
>>>>> std::equal_to<unsigned long>, std::allocator<X86ISA::TlbEntry>
>>>>> >::operator[] (this=0x28b8970, __key=@0x7fffffffba80)
>>>>> at /usr/include/c++/4.4/ext/hash_map:217
>>>>> #5  0x00000000004daa68 in PageTable::map (this=0x28b8970,
>>>>> vaddr=47015569313792, paddr=103079288832,
>>>>> size=5548434767986339840, clobber=false)
>>>>> at build/X86/mem/page_table.cc:82
>>>>> #6  0x000000000074b9c8 in Process::allocateMem (this=0x30be640,
>>>>> vaddr=46912496128000, size=5548434871059525632, clobber=false)
>>>>> at build/X86/sim/process.cc:332
>>>>> #7  0x00000000007aba21 in mmapFunc<X86Linux64> (desc=0x2052fb8, num=9,
>>>>> p=0x30be640, tc=0x3331210) at build/X86/sim/syscall_emul.hh:1069
>>>>> #8  0x000000000073ca11 in SyscallDesc::doSyscall (this=0x2052fb8,
>>>>> callnum=9, process=0x30be640, tc=0x3331210)
>>>>> at build/X86/sim/syscall_emul.cc:69
>>>>> #9  0x00000000007516a0 in LiveProcess::syscall (this=0x30be640,
>>>>> callnum=9, tc=0x3331210) at build/X86/sim/process.cc:590
>>>>> #10 0x0000000000c10ce3 in SimpleThread::syscall (this=0x33305d0,
>>>>> callnum=9) at build/X86/cpu/simple_thread.hh:384
>>>>>
>>>>>
>>>>>
>>>>> As you can see, the problem is in the mmapFunc<X86Linux64> syscall,
>>>>> which allocates memory through Process::allocateMem.
>>>>> That is my understanding....
>>>>>
>>>>>
>>>>>
>>>>> On 4/27/12, Mahmood Naderan <[email protected]> wrote:
>>>>>
>>>>>> Is this useful?
>>>>>>
>>>>>> 339051500: system.cpu + A0 T0 : 0x83d48d.4 : CALL_NEAR_I : wrip
>>>>>> ,
>>>>>> t7, t1 : IntAlu :
>>>>>> 339052000: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>>>>>> 339052000: system.cpu + A0 T0 : 0x852f90 : mov r10, rcx
>>>>>> 339052000: system.cpu + A0 T0 : 0x852f90.0 : MOV_R_R : mov r10,
>>>>>> r10, rcx : IntAlu : D=0x0000000000000022
>>>>>> 339052500: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>>>>>> 339052500: system.cpu + A0 T0 : 0x852f93 : mov eax, 0x9
>>>>>> 339052500: system.cpu + A0 T0 : 0x852f93.0 : MOV_R_I : limm eax,
>>>>>> 0x9 : IntAlu : D=0x0000000000000009
>>>>>> 339053000: system.cpu.icache: ReadReq (ifetch) 452f98 hit
>>>>>> ^C
>>>>>> Program received signal SIGINT, Interrupt.
>>>>>> 0x00000000004e0f90 in
>>>>>> std::__fill_n_a<__gnu_cxx::_Hashtable_node<std::pair<unsigned long
>>>>>> const, X86ISA::TlbEntry> >*, unsigned long,
>>>>>> __gnu_cxx::_Hashtable_node<std::pair<unsigned long const,
>>>>>> X86ISA::TlbEntry> >*> (__first=0x7fff70017000, __n=4065295,
>>>>>> __value=@0x7fffffffb8d0)
>>>>>> at /usr/include/c++/4.4/bits/stl_algobase.h:758
>>>>>> 758             *__first = __tmp;
>>>>>> (gdb) ^CQuit
>>>>>> (gdb)
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 4/27/12, Steve Reinhardt <[email protected]> wrote:
>>>>>>
>>>>>>> Perhaps you could fire off the run under gdb, and use the
>>>>>>> --debug-break flag to drop into gdb at the tick where it seems to
>>>>>>> stop running. If the simulation stops and memory blows up, it's
>>>>>>> almost like you're stuck in some subtle infinite loop with a memory
>>>>>>> allocation in it. (You might have to continue just a little past
>>>>>>> there and hit ctrl-c before it dies to catch it in the middle of
>>>>>>> this loop.)
>>>>>>>
>>>>>>> On Fri, Apr 27, 2012 at 11:29 AM, Mahmood Naderan
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>>> I searched for something similar (stopping the simulation when it
>>>>>>>> reaches a specific memory usage, to prevent it from being killed)
>>>>>>>> but didn't find such a thing. Do you know of one?
>>>>>>>>
>>>>>>>> I also attached gdb. It doesn't show anything useful because it
>>>>>>>> eventually gets killed.
>>>>>>>>
>>>>>>>> On 4/27/12, Gabe Black <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Valgrind should tell you where the leaked memory was allocated.
>>>>>>>>> You may have to give it a command-line option for that, or stop
>>>>>>>>> it before it gets itself killed.
>>>>>>>>>
>>>>>>>>> Gabe
>>>>>>>>>
>>>>>>>>> On 04/27/12 11:10, Steve Reinhardt wrote:
>>>>>>>>>
>>>>>>>>>> Can you attach gdb when it does this, see where it's at, and
>>>>>>>>>> maybe
>>>>>>>>>> step through the code a bit to see what it's doing?
>>>>>>>>>>
>>>>>>>>>> On Fri, Apr 27, 2012 at 10:54 AM, Mahmood Naderan
>>>>>>>>>> <[email protected] <mailto:[email protected]>> wrote:
>>>>>>>>>>
>>>>>>>>>> That was a guess. As I said, I turned on the debugger to see
>>>>>>>>>> when it starts eating memory. As you can see, the last message
>>>>>>>>>> it prints is:
>>>>>>>>>> 339069000: system.cpu + A0 T0 : 0x852f93.0 : MOV_R_I : limm eax,
>>>>>>>>>> 0x9 : IntAlu : D=0x0000000000000009
>>>>>>>>>> 339069500: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>>>>>>>> 339069500: system.cpu.icache: ReadReq (ifetch) 452f98 hit
>>>>>>>>>>
>>>>>>>>>> Then no more messages are printed, and I can see with the top
>>>>>>>>>> command that memory usage goes up and up until it consumes all
>>>>>>>>>> memory.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 4/27/12, Nilay Vaish <[email protected]> wrote:
>>>>>>>>>> > How do you know the instruction at which the memory starts
>>>>>>>>>> > leaking? What should we conclude from the instruction trace in
>>>>>>>>>> > your mail? I am unable to arrive at any conclusion from the
>>>>>>>>>> > valgrind report that you attached. Apart from the info on
>>>>>>>>>> > uninitialized values, I did not find any useful output produced
>>>>>>>>>> > by valgrind.
>>>>>>>>>> >
>>>>>>>>>> > --
>>>>>>>>>> > Nilay
>>>>>>>>>> >
>>>>>>>>>> > On Fri, 27 Apr 2012, Mahmood Naderan wrote:
>>>>>>>>>> >
>>>>>>>>>> >> tonto with the test input uses about 4 GB and runs for about
>>>>>>>>>> >> 2 seconds on a real machine.
>>>>>>>>>> >>
>>>>>>>>>> >> I also used the test input with gem5. However, again after
>>>>>>>>>> >> tick 300000000, all 30 GB of memory are used and then gem5 is
>>>>>>>>>> >> killed. The same behaviour occurs with the ref input...
>>>>>>>>>> >>
>>>>>>>>>> >> I ran the following command:
>>>>>>>>>> >> valgrind --tool=memcheck --leak-check=full --track-origins=yes
>>>>>>>>>> >> --suppressions=../util/valgrind-suppressions ../build/X86/m5.debug
>>>>>>>>>> >> --debug-flags=Cache,ExecAll,Bus,CacheRepl,Context
>>>>>>>>>> >> --trace-start=339050000 ../configs/example/se.py -c
>>>>>>>>>> >> tonto_base.amd64-m64-gcc44-nn --cpu-type=detailed -F 5000000
>>>>>>>>>> >> --maxtick 10000000 --caches --l2cache --prog-interval=100000
>>>>>>>>>> >>
>>>>>>>>>> >>
>>>>>>>>>> >> I also attach the report again. At the instruction where the
>>>>>>>>>> >> memory leak begins, you can see:
>>>>>>>>>> >> ...
>>>>>>>>>> >> 339066000: system.cpu + A0 T0 : 0x83d48d : call 0x15afe
>>>>>>>>>> >> 339066000: system.cpu + A0 T0 : 0x83d48d.0 : CALL_NEAR_I : limm
>>>>>>>>>> >> t1, 0x15afe : IntAlu : D=0x0000000000015afe
>>>>>>>>>> >> 339066500: system.cpu + A0 T0 : 0x83d48d.1 : CALL_NEAR_I : rdip
>>>>>>>>>> >> t7, %ctrl153, : IntAlu : D=0x000000000083d492
>>>>>>>>>> >> 339067000: system.cpu.dcache: set 9a: moving blk 5aa680 to MRU
>>>>>>>>>> >> 339067000: system.cpu.dcache: WriteReq 5aa6b8 hit
>>>>>>>>>> >> 339067000: system.cpu + A0 T0 : 0x83d48d.2 : CALL_NEAR_I : st t7,
>>>>>>>>>> >> SS:[rsp + 0xfffffffffffffff8] : MemWrite : D=0x000000000083d492
>>>>>>>>>> >> A=0x7fffffffe6b8
>>>>>>>>>> >> 339067500: system.cpu + A0 T0 : 0x83d48d.3 : CALL_NEAR_I : subi
>>>>>>>>>> >> rsp, rsp, 0x8 : IntAlu : D=0x00007fffffffe6b8
>>>>>>>>>> >> 339068000: system.cpu + A0 T0 : 0x83d48d.4 : CALL_NEAR_I : wrip ,
>>>>>>>>>> >> t7, t1 : IntAlu :
>>>>>>>>>> >> 339068500: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>>>>>>>> >> 339068500: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>>>>>>>>>> >> 339068500: system.cpu + A0 T0 : 0x852f90 : mov r10, rcx
>>>>>>>>>> >> 339068500: system.cpu + A0 T0 : 0x852f90.0 : MOV_R_R : mov r10,
>>>>>>>>>> >> r10, rcx : IntAlu : D=0x0000000000000022
>>>>>>>>>> >> 339069000: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>>>>>>>> >> 339069000: system.cpu.icache: ReadReq (ifetch) 452f90 hit
>>>>>>>>>> >> 339069000: system.cpu + A0 T0 : 0x852f93 : mov eax, 0x9
>>>>>>>>>> >> 339069000: system.cpu + A0 T0 : 0x852f93.0 : MOV_R_I : limm eax,
>>>>>>>>>> >> 0x9 : IntAlu : D=0x0000000000000009
>>>>>>>>>> >> 339069500: system.cpu.icache: set be: moving blk 452f80 to MRU
>>>>>>>>>> >> 339069500: system.cpu.icache: ReadReq (ifetch) 452f98 hit
>>>>>>>>>> >>
>>>>>>>>>> >>
>>>>>>>>>> >> What is your opinion then?
>>>>>>>>>> >> Regards,
>>>>>>>>>> >>
>>>>>>>>>> >> On 4/27/12, Steve Reinhardt <[email protected]> wrote:
>>>>>>>>>> >>> Also, if you do run valgrind, use the
>>>>>>>>>> >>> util/valgrind-suppressions file to suppress spurious reports.
>>>>>>>>>> >>> Read the valgrind docs to see how this works.
>>>>>>>>>> >>>
>>>>>>>>>> >>> Steve
>>>>>>>>>> >>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> // Naderan *Mahmood;
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> // Naderan *Mahmood;
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> --
>>>>>> // Naderan *Mahmood;
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> // Naderan *Mahmood;
>>>>>
>>>>>
>>>>
>>>> --
>>>> // Naderan *Mahmood;
>>
>
>
> --
> // Naderan *Mahmood;
>
--
// Naderan *Mahmood;
_______________________________________________
gem5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users