Re: [DynInst_API:] ParseAPI and PE files
On 04/16/2014 06:13 AM, E.Robbins wrote:
> Hi, we are trying to use the ParseAPI with PE files. Even with the simple example in the ParseAPI manual, we get an error:
>
> [SymtabCodeSource.C] FATAL: can't create Symtab object for file executable name
>
> It appears that the SymtabCodeSource does not like PE files. Is this a known issue, and/or is there a workaround?

Ed--

Not only is this not a known issue, but it's known to work--the SymtabCodeSource is used internally by Dyninst for all of its parsing, and Windows isn't *that* broken. (At least not if you're working from any remotely stable point, it's not.)

I've been seeing issues with path names with some frequency, though; Symtab will open paths of the standard drive-letter form. Neither the \\device\whatever form nor the Cygwin form gets converted automatically, and that can prevent Symtab from opening a file. If executable name is in drive-letter form, is an absolute path, exists, has good permissions, etc., then this merits further investigation. Are you working on 8.1.2, master, the 8.2 branch...?

Oh. One other thing--if you're trying to analyze PE files on Linux, that's not presently going to work. It might be possible, if you have a Linux system with the necessary Windows headers present and you know of a replacement for the debug SDK, to coerce a Linux build of Symtab to speak PE. You could probably pull the text section out via objdump or similar and stuff it into a fake ELF file. I think I also have a memory-backed CodeSource implementation floating around somewhere that you could use as a starting point--as long as you can find the text section and either don't care about symbols or can find them without Windows headers, mocking up a CodeSource that speaks PE on Linux is a simple matter of engineering. It's engineering we haven't done because parsing PE on Linux is not of much use to Dyninst without a *very* full-featured cross-format Symtab backing it, such that we could rewrite PE files on Linux.

> Thanks,
> Ed

--
--bw
Bill Williams
Paradyn Project
b...@cs.wisc.edu
___
Dyninst-api mailing list
Dyninst-api@cs.wisc.edu
https://lists.cs.wisc.edu/mailman/listinfo/dyninst-api
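The path-form point above can be illustrated with a small pre-processing step. This is only a sketch: toDriveLetterPath is a hypothetical helper, not part of Dyninst or Symtab, and it assumes the Cygwin form means a /cygdrive/<letter>/ prefix. It does not attempt the \\device\whatever form, which would need a lookup of device names.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not a Dyninst API): rewrite a Cygwin-style path such
// as /cygdrive/c/tools/a.exe into drive-letter form, C:\tools\a.exe.
// Any path that does not start with /cygdrive/<letter>/ is returned unchanged.
std::string toDriveLetterPath(const std::string& path) {
    const std::string prefix = "/cygdrive/";
    if (path.compare(0, prefix.size(), prefix) != 0) return path;
    if (path.size() <= prefix.size()) return path;

    const char drive = path[prefix.size()];
    if (!std::isalpha(static_cast<unsigned char>(drive))) return path;

    // The drive letter must be followed by '/' or by the end of the string.
    const std::size_t rest = prefix.size() + 1;
    if (rest < path.size() && path[rest] != '/') return path;

    std::string out;
    out += static_cast<char>(std::toupper(static_cast<unsigned char>(drive)));
    out += ':';
    out += (rest < path.size()) ? path.substr(rest) : std::string("/");
    for (char& c : out) {
        if (c == '/') c = '\\';  // switch to Windows separators
    }
    assert(out.size() >= 3);     // always at least "X:\"
    return out;
}
```

The converted string would then be what gets handed to the SymtabCodeSource constructor; whether that resolves a particular failure still depends on the file existing and being readable.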
Re: [DynInst_API:] ParseAPI and PE files
To be clear, I don't think this is a problem with the way we are using the ParseAPI, as I've written several programs that all work fine for ELF binaries, but not PE. The failure is almost immediate when trying to create the SymtabCodeSource object:

    SymtabCodeSource *sts = new SymtabCodeSource(binary_path);

Ed

From: Dyninst-api [dyninst-api-boun...@cs.wisc.edu] on behalf of E.Robbins [er...@kent.ac.uk]
Sent: 16 April 2014 12:13
To: dyninst-api@cs.wisc.edu
Subject: [DynInst_API:] ParseAPI and PE files

Hi, we are trying to use the ParseAPI with PE files. Even with the simple example in the ParseAPI manual, we get an error:

[SymtabCodeSource.C] FATAL: can't create Symtab object for file executable name

It appears that the SymtabCodeSource does not like PE files. Is this a known issue, and/or is there a work around?

Thanks,
Ed
Re: [DynInst_API:] ParseAPI and PE files
Sorry, I just saw Bill's reply.

From: E.Robbins
Sent: 17 April 2014 16:52
To: dyninst-api@cs.wisc.edu
Subject: RE: ParseAPI and PE files

To be clear, I don't think this is a problem with the way we are using the ParseAPI, as I've written several programs that all work fine for ELF binaries, but not PE. The failure is almost immediate when trying to create the SymtabCodeSource object:

    SymtabCodeSource *sts = new SymtabCodeSource(binary_path);

Ed
Re: [DynInst_API:] ParseAPI and PE files
On 17/04/2014 16:40 PM Bill Williams wrote:
> Oh. One other thing--if you're trying to analyze PE files on Linux, that's not presently going to work. It might be possible, if you have a Linux system with the necessary Windows headers present and you know of a replacement for the debug SDK, to coerce a Linux build of Symtab to speak PE.

Thanks. We are indeed trying to analyse PE files on Linux. I didn't realise that this wasn't supported. When you say the debug SDK, do you mean some kind of MS VS debugger?

> You could probably pull the text section out via objdump or similar and stuff it into a fake ELF file.

We'll have to think about that, but it's certainly an option in the short term, I guess. We are mostly looking at malware, so symbols are mostly useless, but we probably will need to know about linkage, the entry point, etc.

> I think I also have a memory-backed CodeSource implementation floating around somewhere that you could use as a starting point--as long as you can find the text section and either don't care about symbols or can find them without Windows headers, mocking up a CodeSource that speaks PE on Linux is a simple matter of engineering.

What do you mean by a memory-backed CodeSource? We would be interested in anything that can help, though obviously we may decide it's too big a task.

> It's engineering we haven't done because parsing PE on Linux is not of much use to Dyninst without a *very* full-featured cross-format Symtab backing it, such that we could rewrite PE files on Linux.

Fair enough... we are somewhat at odds with the goals of Dyninst because we are doing static analysis and mostly use it for its control flow recovery, which is very good, and to some extent for reading symbols too. The obvious answer then is to use Windows. Can the Windows version of Dyninst work over ELF binaries?

Thanks a lot,
Ed
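Since a Linux build cannot open PE files, a cheap check of the leading magic bytes can tell the two formats apart before a path is handed to SymtabCodeSource. The following is a minimal sketch and not Dyninst code: BinaryFormat and sniffFormat are hypothetical names, and the check relies only on the well-known ELF magic (0x7f 'E' 'L' 'F') and the DOS "MZ" magic that starts a PE file.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names for illustration; nothing here is a Dyninst API.
enum class BinaryFormat { Elf, Pe, Unknown };

// Classify a file image by its first few bytes.
BinaryFormat sniffFormat(const unsigned char* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    if (size >= 4 && data[0] == 0x7f && data[1] == 'E' &&
        data[2] == 'L' && data[3] == 'F') {
        return BinaryFormat::Elf;
    }
    // "MZ" only proves a DOS header; a stricter check would also follow
    // the header to the "PE\0\0" signature.
    if (size >= 2 && data[0] == 'M' && data[1] == 'Z') {
        return BinaryFormat::Pe;
    }
    return BinaryFormat::Unknown;
}
```

A caller on Linux could read the first few bytes of the file, call sniffFormat, and report "PE is not supported by this build" rather than reaching the FATAL message from Symtab.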
Re: [DynInst_API:] ParseAPI and PE files
On 04/17/2014 11:08 AM, E.Robbins wrote:
> On 17/04/2014 16:40 PM Bill Williams wrote:
>> Oh. One other thing--if you're trying to analyze PE files on Linux, that's not presently going to work. It might be possible, if you have a Linux system with the necessary Windows headers present and you know of a replacement for the debug SDK, to coerce a Linux build of Symtab to speak PE.
>
> Thanks. We are indeed trying to analyse PE files in Linux. I didn't realise that this wasn't supported. When you say the debug SDK, do you mean some kind of MS VS debugger?

No, there's a Debug Information Access (DIA) SDK that's available and does a fair bit of symbol parsing for PE files that we then bake into Symtab form. Its accessibility and redistributability have been somewhat variable IIRC, but if you have a non-Express version of Visual Studio, you have full access to MS SDKs last I checked.

>> You could probably pull the text section out via objdump or similar and stuff it into a fake ELF file.
>
> We'll have to think about that, but it's certainly an option in the short term I guess. We are mostly looking at malware so symbols are mostly useless, but we probably will need to know about linkage, the entry point etc etc.

Yeah, if you're concerned with malware, you'll probably want a custom CodeSource whether you're working on Linux or Windows.

>> I think I also have a memory-backed CodeSource implementation floating around somewhere that you could use as a starting point--as long as you can find the text section and either don't care about symbols or can find them without Windows headers, mocking up a CodeSource that speaks PE on Linux is a simple matter of engineering.
>
> What do you mean by a memory-backed CodeSource? We would be interested in anything that can help, though obviously we may decide it's too big a task.

The two CodeSource implementations we distribute are Symtab- and SymLite-based; both of these work with files on disk, as one does. For internal testing purposes, it's convenient to be able to splat a blob of code into memory and parse it, and the CodeSource implementation to do that is what I mean by memory-backed.

It occurs to me that what you actually may want is a Windows implementation of SymLite--something that knows how to mmap in sections, read section headers and the entry point, and optionally can give you a lightweight representation of each symbol. That would then plug into the existing SymLiteCodeSource and should work seamlessly.

>> It's engineering we haven't done because parsing PE on Linux is not of much use to Dyninst without a *very* full-featured cross-format Symtab backing it, such that we could rewrite PE files on Linux.
>
> Fair enough... we are somewhat at odds with the goals of dyninst because we are doing static analysis and mostly use it for its control flow recovery which is very good, and to some extent for reading symbols too. The obvious answer then is to use windows. Can the windows version of dyninst work over ELF binaries?

The Windows version doesn't speak ELF, though I think that's more practical than getting Linux Dyninst to speak PE fully. I haven't checked recently whether libelf/libdwarf build cleanly on Windows, but if they do then that's a pretty straightforward project; we'd build with libelf, libdwarf, and the dynElf/dynDwarf wrapper libraries, add the various ELF-reading source files to the build, and toss a mechanism into Symtab::openFile to check the file type and open things on the right path. (Note, straightforward does not mean low-effort; we'd almost certainly have to redesign some of the class structure so that the file type could be a runtime decision rather than a build-time one. But it's doable in principle.)

I think your best bet would be a cross-platform PE implementation of SymLite, though.

> Thanks a lot,
> Ed

--
--bw
Bill Williams
Paradyn Project
b...@cs.wisc.edu
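As a rough illustration of the "read section headers and the entry point" part of such a PE reader, the sketch below walks the PE headers of an image already loaded into memory, using only the field offsets from the published PE/COFF layout. It is not the SymLite interface and not Dyninst code: PeSection, PeSummary, readPeSummary and sectionContaining are hypothetical names, symbols and imports are ignored, and real malware samples may carry malformed or packed headers that need more defensive handling than shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical types for illustration; not part of SymLite or Dyninst.
struct PeSection {
    std::string name;
    std::uint32_t virtualSize, virtualAddress, rawSize, rawOffset;
};
struct PeSummary {
    std::uint32_t entryPointRva = 0;  // relative to the image base
    std::vector<PeSection> sections;
};

// Little-endian field readers; callers bounds-check before calling.
static std::uint16_t rd16(const std::vector<std::uint8_t>& b, std::size_t o) {
    assert(o + 2 <= b.size());
    return static_cast<std::uint16_t>(b[o] | (b[o + 1] << 8));
}
static std::uint32_t rd32(const std::vector<std::uint8_t>& b, std::size_t o) {
    assert(o + 4 <= b.size());
    return static_cast<std::uint32_t>(b[o]) |
           (static_cast<std::uint32_t>(b[o + 1]) << 8) |
           (static_cast<std::uint32_t>(b[o + 2]) << 16) |
           (static_cast<std::uint32_t>(b[o + 3]) << 24);
}

// Fill `out` from a PE image held in memory; returns false if the headers
// do not look like PE or run past the end of the buffer.
bool readPeSummary(const std::vector<std::uint8_t>& img, PeSummary& out) {
    if (img.size() < 0x40 || img[0] != 'M' || img[1] != 'Z') return false;
    const std::size_t pe = rd32(img, 0x3c);              // e_lfanew
    if (pe > img.size() || img.size() - pe < 24) return false;
    if (img[pe] != 'P' || img[pe + 1] != 'E' ||
        img[pe + 2] != 0 || img[pe + 3] != 0) return false;

    const std::size_t coff = pe + 4;                      // COFF file header
    const std::uint16_t sectionCount = rd16(img, coff + 2);
    const std::uint16_t optSize = rd16(img, coff + 16);
    const std::size_t opt = coff + 20;                    // optional header
    if (optSize < 20 || img.size() - opt < optSize) return false;

    PeSummary result;
    result.entryPointRva = rd32(img, opt + 16);           // AddressOfEntryPoint
    const std::size_t table = opt + optSize;              // section table
    if ((img.size() - table) / 40 < sectionCount) return false;

    for (std::size_t i = 0; i < sectionCount; ++i) {
        const std::size_t s = table + 40 * i;             // 40-byte entries
        PeSection sec;
        sec.name.assign(img.begin() + s, img.begin() + s + 8);
        const std::size_t nul = sec.name.find('\0');      // names are NUL-padded
        if (nul != std::string::npos) sec.name.resize(nul);
        sec.virtualSize = rd32(img, s + 8);
        sec.virtualAddress = rd32(img, s + 12);
        sec.rawSize = rd32(img, s + 16);
        sec.rawOffset = rd32(img, s + 20);
        result.sections.push_back(sec);
    }
    out = result;
    return true;
}

// Find the section whose virtual range covers `rva`, or nullptr.
const PeSection* sectionContaining(const PeSummary& pe, std::uint32_t rva) {
    for (const PeSection& s : pe.sections) {
        const std::uint32_t span = s.virtualSize ? s.virtualSize : s.rawSize;
        if (rva >= s.virtualAddress && rva - s.virtualAddress < span) return &s;
    }
    return nullptr;
}
```

Calling sectionContaining with the entry point RVA gives the section where parsing would start, and rawOffset/rawSize locate its bytes in the file; those bytes are what a memory-backed or SymLite-style CodeSource would then expose to the parser.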