On Sun, Oct 30, 2011 at 10:22 AM, Aziz <[email protected]> wrote:
> Hello marss users and developers,
>
> Below is a simple Python script that matches instructions in the
> objdumpd_<benchmark> file (e.g. objdumpd_bzip2), which is the output of
> "objdump -d <executable>" (e.g. objdump -d bzip2), against the ptl_rip_trace
> file. It is more or less specific to what I needed, but it can be changed
> according to what you need. It works by parsing and searching lines.
>
Thanks, Aziz, for sharing the script. If you don't mind, I was thinking of
adding this script to the 'trace_to_func.py' script so we have one script
that supports multiple options.
> By the way, do you know the variable for the hash table, and micro-op
> buffers?
>
All the micro-opcodes are stored in the 'bbcache' array. It's an array in
which one cache is assigned to each CPU context. A BasicBlockCache is a hash
table keyed by the RIP address. For an example, take a look at the code in
the 'decode-core.cpp' file. To get a dump of the micro-code, you can modify
the 'ptl_flush_bbcache' function in the 'basecore.cpp' file to dump all
bbcache entries before flushing them, and also dump them before the
simulation ends.
- Avadh
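
As a mental model of the structure described above, here is a minimal Python sketch. It is not MARSS code: the real implementation is C++, and the names used here (`BasicBlockCacheModel`, `flush_with_dump`) and the micro-op strings are hypothetical. It only illustrates the idea of one cache per CPU context, keyed by RIP, that is dumped before it is flushed.

```python
import io


class BasicBlockCacheModel:
    """Toy stand-in for one per-context cache: RIP -> list of micro-op strings."""

    def __init__(self):
        self.blocks = {}

    def insert(self, rip, uops):
        self.blocks[rip] = list(uops)

    def dump(self, out, ctx_id):
        # Write entries in ascending RIP order so two dumps are easy to diff.
        for rip in sorted(self.blocks):
            uops = "; ".join(self.blocks[rip])
            out.write("ctx %d rip %#x: %s\n" % (ctx_id, rip, uops))

    def flush(self):
        self.blocks.clear()


def flush_with_dump(bbcache, out):
    """Dump every per-context cache, then flush it (dump must come first)."""
    for ctx_id, cache in enumerate(bbcache):
        cache.dump(out, ctx_id)
        cache.flush()


# One cache per CPU context, as in the description above.
bbcache = [BasicBlockCacheModel() for _ in range(2)]
bbcache[0].insert(0x40bf2f, ["lea-like uop"])  # made-up micro-op names
bbcache[1].insert(0x400100, ["load-like uop", "add-like uop"])

log = io.StringIO()
flush_with_dump(bbcache, log)
print(log.getvalue(), end="")
```

Because the entries disappear on every flush, the dump has to happen inside the flush path (and once more at the end of the run) to capture everything that was decoded.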
> #=============================================
> import time
> import string
> import os
> import sys
>
>
> process_name = "bzip2"
>
>
> def list_diff_insns(process_name):
>     list_insns = []
>     fileobject = open ("objdumpd_"+process_name, 'r')
>     fileobject.readline()
>     fileobject.readline() #skip first 2 lines
>     while 1:
>         line = fileobject.readline()
>         if not line:
>             break
>         line = string.strip(line)
>         if not line:
>             continue
>         #if last character is ":" skip the line
>         if line[-1] == ":":
>             continue
>         #print "LINE IS: ", line
>         set1 = string.split(line, "\t") # 40bf2f: 48 8d 7c 24 0c lea 0xc(%rsp),%rdi
>         if len(set1) < 3:
>             continue
>         insn = set1[2] # lea 0xc(%rsp),%rdi
>         set2 = string.split(insn)
>         if set2[0] not in list_insns: #if lea is not in list_insns
>             list_insns.append(set2[0])
>     #print list_insns
>     return list_insns
>
>
>
> def list_insns_committed(process_name):
>     list_insns = list_diff_insns(process_name)
>     #create a list of zeros of the same size as # of different instructions
>     num_insns = [0]*len(list_insns)
>     fileptlriptrace = open ("ptl_rip_trace", 'r')
>     fileobjdumpd = open ("objdumpd_"+process_name, 'r')
>     start_pos_objdumpd = fileobjdumpd.tell()
>     line_num = 0
>     #print "\n\n\nBU 1\n\n\n"
>     while 1:
>         line = fileptlriptrace.readline()
>         fileobjdumpd.seek(start_pos_objdumpd)
>         line_num += 1
>         if not line:
>             break
>         line = string.strip(line)
>         if not line:
>             continue
>         stripped_line = string.split(line)
>         #print stripped_line[-1], len(hex(int(stripped_line[1],16)))
>         if (stripped_line[-1] == "0" and len(hex(int(stripped_line[1],16))) <= 10 ):
>             #print line_num, " ", stripped_line
>             address1 = int(stripped_line[1],16)
>             #skip first 2 lines
>             fileobjdumpd.readline()
>             fileobjdumpd.readline()
>             line2 = fileobjdumpd.readline()
>             line2_not_read = 0
>             while 1:
>                 add3set = 0
>                 if line2_not_read == 1:
>                     line2 = fileobjdumpd.readline()
>                     line2_not_read = 0
>                 #print "LINE2 IS: ",line2
>                 if not line2:
>                     #print "not line2"
>                     break
>                 line2 = string.strip(line2)
>                 if not line2:
>                     #print "not line2 after strip"
>                     line2_not_read = 1
>                     continue
>                 if line2[-1] == ":":
>                     #print "line[-1]=\":\""
>                     line2_not_read = 1
>                     continue
>                 line2_next = fileobjdumpd.readline()
>                 if not line2_next:
>                     address3 = 99999999999999999999
>                     add3set = 1
>                 elif not string.strip(line2_next):
>                     continue
>                 else:
>                     line2_next = string.strip(line2_next)
>                     if line2_next[-1] == ":" or line2_next == "...":
>                         continue
>                 line2_next = string.strip(line2_next)
>                 #print "LINE2_NEXT IS:", line2_next
>                 set_objdump = string.split(line2, ":")
>                 set_objdump_next = string.split(line2_next, ":")
>                 address2 = int(set_objdump[0],16)
>                 if add3set == 0:
>                     address3 = int(set_objdump_next[0],16)
>                 #print "address1:",address1,"\naddress2:",address2,"\naddress3:",address3,"\n\n"
>                 if address1 > address2 and address1 < address3:
>                     print set_objdump[1]
>                     set3 = string.split(set_objdump[1], "\t")
>                     if len(set3) < 3:
>                         break
>                     set4 = string.split(set3[2])
>                     instruction = set4[0]
>                     print "INNNN ", instruction
>                     num_insns[list_insns.index(instruction)]+=1
>                     break
>                 line2 = line2_next
>     for i in range(len(list_insns)):
>         print list_insns[i], " :\t", num_insns[i]
> #list_diff_insns(process_name)
> list_insns_committed(process_name)
> #=============================================
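
The script above rescans the objdump file once for every trace line. A possible alternative, sketched here in Python 3, is to parse the disassembly once into a sorted address list and look up each trace address with a binary search. This is only a sketch under assumptions: the function names are hypothetical, the trace layout (hex RIP in the second field, a trailing "0" marking the lines of interest) is inferred from the script and the sample trace lines are made up, and unlike the script it counts a trace address that equals an instruction's start address as a match.

```python
import bisect
from collections import Counter


def parse_objdump(lines):
    """Return sorted (address, mnemonic) pairs from 'objdump -d' text lines."""
    insns = []
    for line in lines:
        # Expected fields: "40bf2f:", "48 8d 7c 24 0c", "lea 0xc(%rsp),%rdi"
        fields = line.strip().split("\t")
        if len(fields) < 3 or not fields[0].endswith(":"):
            continue
        try:
            addr = int(fields[0][:-1], 16)
        except ValueError:
            continue
        parts = fields[2].split()
        if parts:
            insns.append((addr, parts[0]))
    insns.sort()
    return insns


def count_committed(insns, trace_lines):
    """Count mnemonics for the trace addresses that pass the script's filter."""
    addrs = [a for a, _ in insns]
    counts = Counter()
    for line in trace_lines:
        fields = line.split()
        # Assumed layout: field 1 is the hex RIP, last field "0" is kept.
        if len(fields) < 2 or fields[-1] != "0":
            continue
        rip = int(fields[1], 16)
        if rip > 0xFFFFFFFF:  # the script keeps only addresses of <= 8 hex digits
            continue
        # Index of the last instruction whose start address is <= rip.
        i = bisect.bisect_right(addrs, rip) - 1
        if i >= 0:
            counts[insns[i][1]] += 1
    return counts


# Tiny made-up inputs; real ones would be read from objdumpd_bzip2 and ptl_rip_trace.
objdump_text = [
    "bzip2:     file format elf64-x86-64",
    "",
    "000000000040bf20 <main>:",
    "  40bf20:\t55                   \tpush   %rbp",
    "  40bf21:\t48 89 e5             \tmov    %rsp,%rbp",
    "  40bf2f:\t48 8d 7c 24 0c       \tlea    0xc(%rsp),%rdi",
]
trace_text = ["x 40bf20 0", "x 40bf22 0", "x 40bf2f 0", "x 40bf2f 1"]

counts = count_committed(parse_objdump(objdump_text), trace_text)
for mnemonic, n in sorted(counts.items()):
    print(mnemonic, ":\t", n)
```

With the disassembly held in memory, each lookup costs a binary search instead of a pass over the file, which matters when the trace has millions of lines.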
>
> Thanks,
> Aziz Eker
>
>
>
> On Sat, Oct 29, 2011 at 10:54 PM, Aziz <[email protected]> wrote:
>
>> You said that the simulator translates the instructions to micro-ops and
>> keeps a hash of RIP to micro-op buffers. I could not find which variables
>> hold those buffers and the hash table so that I can print them.
>>
>> I have not completed the script yet, but I will share it as soon as it is
>> done.
>>
>> Best,
>> Aziz
>>
>> On Sat, Oct 29, 2011 at 6:36 PM, avadh patel <[email protected]> wrote:
>>
>>>
>>> On Sat, Oct 29, 2011 at 8:21 AM, Aziz <[email protected]> wrote:
>>>
>>>> Thank you for this information. It was really helpful and I finally
>>>> have been able to obtain the trace information.
>>>> I also want to get the micro-operation trace. Since it is kept in the
>>>> buffers, is there an easy way to obtain it? Or should I add code to get
>>>> it?
>>>>
>>> These buffers are flushed on each context switch, so it's better to add
>>> code that will print them.
>>> Also, it would be helpful to others if you could share the changes you made
>>> to the script to get the instruction trace.
>>>
>>> - Avadh
>>>
>>>
>>>> Thanks,
>>>> Aziz
>>>>
>>>>
>>>>
>>>> On Fri, Oct 28, 2011 at 1:31 AM, avadh patel <[email protected]> wrote:
>>>>
>>>>> The script depends on the file you pass in via the '-o' option. That file
>>>>> contains the function names and start addresses, and the RIP trace contains
>>>>> the addresses that are committed. The script simply maps each address to
>>>>> its function and prints the function name alongside the trace address.
>>>>> You'll need to modify the script to take the output of 'objdump -d' (which
>>>>> has all instructions and addresses) as input and map each trace address
>>>>> into it to get the x86 instruction.
>>>>>
>>>>> - Avadh
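
The address-to-function mapping described above can be sketched as follows. This is not the code of trace_to_func.py; the function names and the sample symbols are hypothetical, and the parser assumes the usual 'objdump -t' row layout (address, flag letters, section, size, name) with "F" marking function symbols.

```python
import bisect


def parse_symbols(lines):
    """Return sorted (start_address, name) pairs for function symbols."""
    funcs = []
    for line in lines:
        fields = line.split()
        # Assumed row: address, flag letters, section, size, name.
        if len(fields) < 5 or "F" not in fields[1:-3]:
            continue
        try:
            funcs.append((int(fields[0], 16), fields[-1]))
        except ValueError:
            continue
    funcs.sort()
    return funcs


def function_for(funcs, addr):
    """Name of the function with the closest start address <= addr.

    Symbol sizes are ignored, so an address past the end of the last
    function still maps to that function.
    """
    starts = [s for s, _ in funcs]
    i = bisect.bisect_right(starts, addr) - 1
    return funcs[i][1] if i >= 0 else None


# Made-up symbol table rows for illustration.
symtab = [
    "SYMBOL TABLE:",
    "0000000000400238 l    d  .interp\t0000000000000000              .interp",
    "000000000040bf20 g     F .text\t0000000000000040              main",
    "000000000040bf60 g     F .text\t0000000000000010              helper",
]
funcs = parse_symbols(symtab)
for rip in (0x40bf2f, 0x40bf64, 0x400000):
    print(hex(rip), function_for(funcs, rip))
```

Swapping the symbol table for the 'objdump -d' listing, and function names for mnemonics, gives the instruction-level mapping suggested above.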
>>>>>
>>>>>
>>>>> On Wed, Oct 26, 2011 at 7:45 AM, Aziz <[email protected]> wrote:
>>>>>
>>>>>> Thank you for your help and the script. Finally I've been able to get
>>>>>> the functions.
>>>>>>
>>>>>> Could you please give me some pointers on how to modify the script to
>>>>>> give me the instruction trace?
>>>>>>
>>>>>> Thanks,
>>>>>> Aziz
>>>>>>
>>>>>> On Wed, Oct 26, 2011 at 12:28 AM, Furat Afram <[email protected]> wrote:
>>>>>>
>>>>>>> Try ./trace_to_func.py ptl_rip_trace output.txt -o objectfile
>>>>>>>
>>>>>>> where objectfile is the output of objdump -t.
>>>>>>> I think this will give you the functions, not the instructions, but it
>>>>>>> shouldn't be hard to modify it to give you the instruction opcodes.
>>>>>>> -Furat
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Oct 25, 2011 at 1:54 PM, Aziz <[email protected]> wrote:
>>>>>>> > Thanks for the response. I've been trying hard to get somewhere with
>>>>>>> > obtaining the instruction trace, but no luck.
>>>>>>> >
>>>>>>> >>>
>>>>>>> >>> I need to obtain an instruction trace for the simulation run. I
>>>>>>> >>> checked the email archive, but defining TRACE_RIP only gives me
>>>>>>> >>> hex-coded instructions, whereas I need the instruction, registers,
>>>>>>> >>> and memory addresses, as in "add eax, 0xf4". Is there any way to
>>>>>>> >>> obtain this?
>>>>>>> >>
>>>>>>> >> It's a little tricky because the simulator translates the
>>>>>>> >> instructions to micro-ops and keeps a hash of RIP to micro-op
>>>>>>> >> buffers. So once an instruction is decoded into micro-ops, we don't
>>>>>>> >> keep track of the original instruction. In order to create a trace
>>>>>>> >> file, you'll need to add a new hash table that maps each RIP address
>>>>>>> >> to its original instruction. Then you can use that in the pipeline
>>>>>>> >> to dump the trace along with register values and memory addresses.
>>>>>>> >
>>>>>>> > I tried to get into the code. I found that qemu works on the
>>>>>>> > instructions in the disas_insn() function (in
>>>>>>> > qemu/target-i386/translate.c), but marss transfers control to ptlsim
>>>>>>> > using gen_helper_switch_to_sim(). I did not understand, though, what
>>>>>>> > gen_jmp_im(pc_start - s->cs_base) does (line 4080 in
>>>>>>> > qemu/target-i386/translate.c).
>>>>>>> > Then I thought, why use ptlsim? I can just get the instructions from
>>>>>>> > qemu. When I searched for it on the web, I found this document
>>>>>>> > http://www.iamroot.org/xe/?module=file&act=procFileDownload&file_srl=37296&sid=1cb6b46c0111f9909279b58df123efa6
>>>>>>> > which explains how to trace instructions using qemu. I tried the
>>>>>>> > method they give in the "Trace instructions in full system emulation"
>>>>>>> > section, but somehow I could not make it work.
>>>>>>> > Then I tried using the gdb debugger to single-step through the
>>>>>>> > instructions (as explained in
>>>>>>> > http://thread.gmane.org/gmane.comp.emulators.qemu/16604), but neither
>>>>>>> > gdb nor the singlestep option worked for me with marss. Also, when I
>>>>>>> > try "printf" in qemu files (e.g. in disas_insn in translate.c), it
>>>>>>> > does not print anything.
>>>>>>> > I would appreciate it if you could point me to the correct functions
>>>>>>> > to change, and where and what to print to get the trace file.
>>>>>>> > I also need to get the trace of the micro-ops in the same format I
>>>>>>> > explained (micro-op and register). Is there any automatic way to get
>>>>>>> > that? If not, what should I do to acquire that kind of trace file?
>>>>>>> >>>
>>>>>>> >>> Also, I could not get the trace_to_func.py file that Avadh gave to
>>>>>>> >>> work. It gives its usage as "trace_to_func.py [options] trace_file
>>>>>>> >>> outputfile". I use ptl_rip_trace as trace_file and leave the options
>>>>>>> >>> empty, but it always gives the same Usage message.
>>>>>>> >>
>>>>>>> >> Did you specify the 'outputfile' ?
>>>>>>> >
>>>>>>> > Yes, I specified a filename for the output. The following output
>>>>>>> > still comes up:
>>>>>>> >
>>>>>>> > $ ./trace_to_func.py ptl_rip_trace output.txt
>>>>>>> > Usage: trace_to_func.py [options] trace_file outputfile
>>>>>>> >
>>>>>>> > trace_to_func.py -h for help
>>>>>>> >
>>>>>>> > Thanks a lot for your help and for the great effort you put into
>>>>>>> > marss.
>>>>>>> > Best,
>>>>>>> > Aziz
>>>>>>> > _______________________________________________
>>>>>>> > http://www.marss86.org
>>>>>>> > Marss86-Devel mailing list
>>>>>>> > [email protected]
>>>>>>> > https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel