On Sun, Oct 30, 2011 at 10:22 AM, Aziz <[email protected]> wrote:

> Hello marss users and developers,
>
> Below is a simple Python script that matches the instructions in an
> objdumpd_<benchmark> file (e.g. objdumpd_bzip2), which is the output of
> "objdump -d <executable>" (e.g. objdump -d bzip2), against the
> ptl_rip_trace file. It is more or less specific to what I needed, but it
> can be adapted to whatever you need. It works by parsing and searching
> lines.
>
Thanks Aziz for sharing the script.  If you don't mind, I am thinking of
adding it to the 'trace_to_func.py' script so that we have one script that
supports multiple options.


> By the way, do you know which variables hold the hash table and the
> micro-op buffers?
>
All the micro-ops are stored in the 'bbcache' array.  It's an array with one
cache per CPU context.  A BasicBlockCache is a hash table keyed by the RIP
address.  For an example, take a look at the code in the 'decode-core.cpp'
file.  To get a dump of the micro-ops, you can modify the
'ptl_flush_bbcache' function in the 'basecore.cpp' file to dump all bbcache
entries before flushing them, and also dump them before the simulation ends.

- Avadh

> #=============================================
> import time
> import string
> import os
> import sys
>
>
> process_name = "bzip2"
>
>
> def list_diff_insns(process_name):
>   list_insns = []
>   fileobject = open ("objdumpd_"+process_name, 'r')
>   fileobject.readline()
>   fileobject.readline() #skip first 2 lines
>   while 1:
>     line = fileobject.readline()
>     if not line:
>       break
>     line = string.strip(line)
>     if not line:
>       continue
>     #if last character is ":" skip the line
>     if line[-1] == ":":
>       continue
>     #print "LINE IS: ", line
>     # example line: 40bf2f: 48 8d 7c 24 0c       lea    0xc(%rsp),%rdi
>     set1 = string.split(line, "\t")
>     if len(set1) < 3:
>       continue
>     insn = set1[2]    # lea    0xc(%rsp),%rdi
>     set2 = string.split(insn)
>     if set2[0] not in list_insns: #if lea is not in list_insns
>       list_insns.append(set2[0])
>   #print list_insns
>   return list_insns
>
>
>
> def list_insns_committed(process_name):
>   list_insns = list_diff_insns(process_name)
>   # create a list of zeros, one per distinct instruction
>   num_insns = [0]*len(list_insns)
>   fileptlriptrace = open ("ptl_rip_trace", 'r')
>   fileobjdumpd = open ("objdumpd_"+process_name, 'r')
>   start_pos_objdumpd = fileobjdumpd.tell()
>   line_num = 0
>   #print "\n\n\nBU 1\n\n\n"
>   while 1:
>     line = fileptlriptrace.readline()
>     fileobjdumpd.seek(start_pos_objdumpd)
>     line_num += 1
>     if not line:
>       break
>     line = string.strip(line)
>     if not line:
>       continue
>     stripped_line = string.split(line)
>     #print stripped_line[-1], len(hex(int(stripped_line[1],16)))
>     if (stripped_line[-1] == "0" and
>         len(hex(int(stripped_line[1],16))) <= 10):
>       #print line_num, " ", stripped_line
>       address1 = int(stripped_line[1],16)
>       #skip first 2 lines
>       fileobjdumpd.readline()
>       fileobjdumpd.readline()
>       line2 = fileobjdumpd.readline()
>       line2_not_read = 0
>       while 1:
>         add3set = 0
>         if line2_not_read == 1:
>           line2 = fileobjdumpd.readline()
>           line2_not_read = 0
>         #print "LINE2 IS: ",line2
>         if not line2:
>           #print "not line2"
>           break
>         line2 = string.strip(line2)
>         if not line2:
>           #print "not line2 after strip"
>           line2_not_read = 1
>           continue
>         if line2[-1] == ":":
>           #print "line[-1]=\":\""
>           line2_not_read = 1
>           continue
>         line2_next = fileobjdumpd.readline()
>         if not line2_next:
>           address3 = 99999999999999999999
>           add3set = 1
>         elif not string.strip(line2_next):
>           continue
>         else:
>           line2_next = string.strip(line2_next)
>           if line2_next[-1] == ":" or line2_next == "...":
>             continue
>         line2_next = string.strip(line2_next)
>         #print "LINE2_NEXT IS:", line2_next
>         set_objdump = string.split(line2, ":")
>         set_objdump_next = string.split(line2_next, ":")
>         address2 = int(set_objdump[0],16)
>         if add3set == 0:
>           address3 = int(set_objdump_next[0],16)
>         #print "address1:",address1,"\naddress2:",address2,"\naddress3:",address3,"\n\n"
>         if address1 > address2 and address1 < address3:
>           print set_objdump[1]
>           set3 = string.split(set_objdump[1], "\t")
>           if len(set3) < 3:
>             break
>           set4 = string.split(set3[2])
>           instruction = set4[0]
>           print "INNNN ", instruction
>           num_insns[list_insns.index(instruction)]+=1
>           break
>         line2 = line2_next
>   for i in range(len(list_insns)):
>     print list_insns[i], " :\t", num_insns[i]
>
> #list_diff_insns(process_name)
> list_insns_committed(process_name)
> #=============================================
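The script above rescans the objdump file for every trace line. As a rough
sketch of the same idea in Python 3, the disassembly can be parsed once into
a sorted address list and each trace RIP looked up with a binary search. The
names load_objdump and count_committed are hypothetical. The trace layout
(RIP in hex as the second field, a trailing "0" on the lines of interest,
addresses below 2**32) is assumed from the checks in the script above. Unlike
that script's strict comparison, this version counts a RIP equal to an
instruction's start address as a hit on that instruction.

```python
import bisect
import re
from collections import Counter

# Matches an instruction line of `objdump -d`, e.g.
#   "  40bf2f:\t48 8d 7c 24 0c \tlea    0xc(%rsp),%rdi"
# Symbol headers and byte-only continuation lines do not match.
INSN_RE = re.compile(r"^\s*([0-9a-fA-F]+):\t[^\t]*\t(\S+)")


def load_objdump(lines):
    """Return (sorted addresses, mnemonics) parsed from `objdump -d` lines."""
    insns = {}
    for line in lines:
        m = INSN_RE.match(line)
        if m:
            insns[int(m.group(1), 16)] = m.group(2)
    addrs = sorted(insns)
    return addrs, [insns[a] for a in addrs]


def count_committed(trace_lines, addrs, mnemonics):
    """Count mnemonics for the trace RIPs that fall inside the disassembly."""
    counts = Counter()
    for line in trace_lines:
        fields = line.split()
        # Assumed layout: field 1 is the RIP in hex, last field is "0".
        if len(fields) < 2 or fields[-1] != "0":
            continue
        try:
            rip = int(fields[1], 16)
        except ValueError:
            continue  # not a trace record
        if rip >= 1 << 32:
            continue  # same cut-off as the 8-hex-digit check above
        # Index of the last instruction starting at or before rip.
        i = bisect.bisect_right(addrs, rip) - 1
        if i >= 0:
            counts[mnemonics[i]] += 1
    return counts
```

With real files this could be driven as
addrs, mnemonics = load_objdump(open("objdumpd_bzip2")) followed by
count_committed(open("ptl_rip_trace"), addrs, mnemonics).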
>
> Thanks,
> Aziz Eker
>
>
>
> On Sat, Oct 29, 2011 at 10:54 PM, Aziz <[email protected]> wrote:
>
>> You said that the simulator translates the instructions to micro-ops and
>> keeps a hash of RIP to micro-op buffers. I could not find which variables
>> hold those buffers and the hash table, so that I can print them.
>>
>> I have not completed the script yet, but I will share it as soon as it is
>> done.
>>
>> Best,
>> Aziz
>>
>> On Sat, Oct 29, 2011 at 6:36 PM, avadh patel <[email protected]> wrote:
>>
>>>
>>> On Sat, Oct 29, 2011 at 8:21 AM, Aziz <[email protected]> wrote:
>>>
>>>> Thank you for this information. It was really helpful and I finally
>>>> have been able to obtain the trace information.
>>>> I also want to get the micro-operation trace. Since it is kept in the
>>>> buffers, is there an easy way to obtain it? Or should I add code to
>>>> get it?
>>>>
>>> These buffers are flushed on each context switch, so it's better to add
>>> code that prints them.
>>> It would also be helpful to others if you could share the changes you made
>>> to the script to get the instruction trace.
>>>
>>> - Avadh
>>>
>>>
>>>> Thanks,
>>>> Aziz
>>>>
>>>>
>>>>
>>>> On Fri, Oct 28, 2011 at 1:31 AM, avadh patel <[email protected]> wrote:
>>>>
>>>>> The script depends on the file you pass via the '-o' option.  That file
>>>>> contains the function names and start addresses, and the RIP trace
>>>>> contains the addresses that are committed.  The script simply maps each
>>>>> address to its function and prints the function name alongside the trace
>>>>> address.  You'll need to modify the script to take the output of
>>>>> 'objdump -d' (which has all instructions and addresses) as input and
>>>>> look up each trace address in it to get the x86 instruction.
>>>>>
>>>>> - Avadh
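As a hedged illustration of the address-to-function mapping described above
(not the actual code of trace_to_func.py), the Python 3 sketch below reads
'objdump -t' output, keeps the defined function symbols, and maps an address
to the function with the nearest start address at or below it. The names
load_functions and function_for are hypothetical, and the symbol-table rows
are assumed to follow the usual address / flags / section / size / name
layout. Symbol sizes are ignored, so an address past the end of the last
function is still attributed to it.

```python
import bisect
import re

# One row of `objdump -t`: address, 7 flag characters, section, size, name.
SYM_RE = re.compile(r"^([0-9a-fA-F]+) (.{7}) (\S+)\s+([0-9a-fA-F]+)\s+(.+)$")


def load_functions(lines):
    """Return (sorted start addresses, names) of defined function symbols."""
    funcs = []
    for line in lines:
        m = SYM_RE.match(line.rstrip("\n"))
        if not m:
            continue  # headers and blank lines
        addr, flags, section, _size, name = m.groups()
        # "F" in the flag column marks a function; skip undefined symbols.
        if "F" in flags and section != "*UND*":
            # The name column may carry a prefix such as ".hidden".
            funcs.append((int(addr, 16), name.split()[-1]))
    funcs.sort()
    return [a for a, _ in funcs], [n for _, n in funcs]


def function_for(rip, starts, names):
    """Name of the function starting at or before rip, or None."""
    i = bisect.bisect_right(starts, rip) - 1
    return names[i] if i >= 0 else None
```

Each address read from ptl_rip_trace could then be passed through
function_for to print the function name next to it.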
>>>>>
>>>>>
>>>>> On Wed, Oct 26, 2011 at 7:45 AM, Aziz <[email protected]> wrote:
>>>>>
>>>>>> Thank you for your help and the script. Finally I've been able to get
>>>>>> the functions.
>>>>>>
>>>>>> Could you please give me some pointers on how to modify the script to
>>>>>> give me the instruction trace?
>>>>>>
>>>>>> Thanks,
>>>>>> Aziz
>>>>>>
>>>>>> On Wed, Oct 26, 2011 at 12:28 AM, Furat Afram <[email protected]> wrote:
>>>>>>
>>>>>>> Try ./trace_to_func.py ptl_rip_trace output.txt -o objectfile
>>>>>>>
>>>>>>> where objectfile is the output of objdump -t.
>>>>>>> I think this will give you the functions, not the instructions, but it
>>>>>>> shouldn't be hard to modify it to give you the instruction opcodes.
>>>>>>> -Furat
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Oct 25, 2011 at 1:54 PM, Aziz <[email protected]> wrote:
>>>>>>> > Thanks for the response. I've been trying hard to obtain the
>>>>>>> > instruction trace, but no luck.
>>>>>>> >
>>>>>>> >>>
>>>>>>> >>> I need to obtain an instruction trace for the simulation run. I
>>>>>>> >>> checked the email archive, but defining TRACE_RIP only gives me
>>>>>>> >>> hex-coded instructions, whereas I need the instruction, registers,
>>>>>>> >>> and memory addresses, as in "add eax, 0xf4". Is there any way to
>>>>>>> >>> obtain this?
>>>>>>> >>
>>>>>>> >> It's a little tricky because the simulator translates the
>>>>>>> >> instructions to micro-ops and keeps a hash of RIP to micro-op
>>>>>>> >> buffers. So once an instruction is decoded into micro-ops, we don't
>>>>>>> >> keep track of the original instruction.  In order to create a trace
>>>>>>> >> file, you'll need to add a new hash table that maps each RIP address
>>>>>>> >> to its original instruction.  Then you can use that in the pipeline
>>>>>>> >> to dump the trace along with register values and memory addresses.
>>>>>>> >
>>>>>>> > I tried to get into the code. I found that qemu works on the
>>>>>>> > instructions in the disas_insn() function (at
>>>>>>> > qemu/target-i386/translate.c), but marss transfers control to ptlsim
>>>>>>> > using gen_helper_switch_to_sim(). I did not understand, though, what
>>>>>>> > gen_jmp_im(pc_start - s->cs_base) does (line 4080 in
>>>>>>> > qemu/target-i386/translate.c).
>>>>>>> > Then I thought, why use ptlsim when I can just get the instructions
>>>>>>> > from qemu? When I searched for it on the web, I found this document
>>>>>>> >
>>>>>>> > http://www.iamroot.org/xe/?module=file&act=procFileDownload&file_srl=37296&sid=1cb6b46c0111f9909279b58df123efa6
>>>>>>> > which explains how to trace instructions using qemu. I tried the
>>>>>>> > method given in the "Trace instructions in full system emulation"
>>>>>>> > section, but somehow I could not make it work.
>>>>>>> > Then I tried using the gdb debugger to single-step through the
>>>>>>> > instructions (as explained in
>>>>>>> > http://thread.gmane.org/gmane.comp.emulators.qemu/16604), but neither
>>>>>>> > gdb nor the singlestep option worked for me with marss. Also, when I
>>>>>>> > try "printf" in qemu files (e.g. in disas_insn in translate.c), it
>>>>>>> > does not print anything.
>>>>>>> > I would appreciate it if you could point me to the correct functions
>>>>>>> > to change, and where and what to print to get the trace file.
>>>>>>> > I also need to get the trace of the micro-ops in the same format I
>>>>>>> > explained (micro-op and register). Is there any automatic way to get
>>>>>>> > that? If not, what should I do to acquire that kind of trace file?
>>>>>>> >>>
>>>>>>> >>> Also, I could not get the trace_to_func.py file that Avadh gave to
>>>>>>> >>> work. It gives its usage as "trace_to_func.py [options] trace_file
>>>>>>> >>> outputfile". I use ptl_rip_trace as trace_file and leave the
>>>>>>> >>> options empty, but it always prints the same Usage message.
>>>>>>> >>
>>>>>>> >> Did you specify the 'outputfile' ?
>>>>>>> >
>>>>>>> > Yes, I specified a filename for the output. Still, the following
>>>>>>> > output comes up:
>>>>>>> >
>>>>>>> > $      ./trace_to_func.py ptl_rip_trace output.txt
>>>>>>> > Usage: trace_to_func.py [options] trace_file outputfile
>>>>>>> >
>>>>>>> > trace_to_func.py -h for help
>>>>>>> >
>>>>>>> > Thanks a lot for your help and for the great effort you put into
>>>>>>> > marss.
>>>>>>> > Best,
>>>>>>> > Aziz
>>>>>>> > _______________________________________________
>>>>>>> > http://www.marss86.org
>>>>>>> > Marss86-Devel mailing list
>>>>>>> > [email protected]
>>>>>>> > https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel
>>>>>>> >
>>>>>>> >
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>