Re: [DynInst_API:] ParseAPI and PE files

2014-04-17 Thread Bill Williams

On 04/16/2014 06:13 AM, E.Robbins wrote:

Hi,
we are trying to use the ParseAPI with PE files. Even with the simple example 
in the ParseAPI manual, we get an error:

[SymtabCodeSource.C] FATAL: can't create Symtab object for file executable 
name

It appears that the SymtabCodeSource does not like PE files. Is this a known 
issue, and/or is there a work around?


Ed--

Not only is this not a known issue, but it's known to work--the 
SymtabCodeSource is used internally by Dyninst for all of its parsing, 
and Windows isn't *that* broken. (At least not if you're working from 
any remotely stable point, it's not.)


I've been seeing issues with path names with some frequency, though; 
Symtab will open paths of the standard drive-letter form, but neither the 
\\device\whatever form nor the Cygwin form gets converted 
automatically, and that can prevent Symtab from opening a file.
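
For illustration, the Cygwin-to-drive-letter conversion described above could be done by the caller before handing the path to Symtab. The following is a standalone C++ sketch, not part of Symtab; the function names are hypothetical, and it assumes Cygwin's default /cygdrive/<letter>/ prefix. The \\device form is not handled here.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns true if `path` starts with a drive letter, e.g. "C:\dir\a.exe".
// (Hypothetical helper, not a Symtab function.)
bool isDriveLetterPath(const std::string& path) {
    return path.size() >= 3 &&
           std::isalpha(static_cast<unsigned char>(path[0])) &&
           path[1] == ':' &&
           (path[2] == '\\' || path[2] == '/');
}

// Converts "/cygdrive/c/dir/a.exe" to "C:\dir\a.exe".
// Assumes the default Cygwin mount prefix; any path that does not match
// that pattern is returned unchanged.
std::string cygwinToDriveLetter(const std::string& path) {
    const std::string prefix = "/cygdrive/";
    if (path.compare(0, prefix.size(), prefix) != 0) return path;
    if (path.size() < prefix.size() + 2) return path;

    const char drive = path[prefix.size()];
    if (!std::isalpha(static_cast<unsigned char>(drive)) ||
        path[prefix.size() + 1] != '/') {
        return path;
    }

    std::string out;
    out += static_cast<char>(std::toupper(static_cast<unsigned char>(drive)));
    out += ':';
    // Copy the remainder, turning forward slashes into backslashes.
    for (std::size_t i = prefix.size() + 1; i < path.size(); ++i) {
        out += (path[i] == '/') ? '\\' : path[i];
    }
    return out;
}
```

A caller would run the path through cygwinToDriveLetter() and check isDriveLetterPath() on the result before constructing the SymtabCodeSource.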


If the executable name is an absolute path in drive-letter form, the file 
exists, permissions are good, etc., then this merits further investigation. Are 
you working on 8.1.2, master, the 8.2 branch...?


Oh. One other thing--if you're trying to analyze PE files on Linux, 
that's not presently going to work. It might be possible, if you have a 
Linux system with the necessary Windows headers present and you know of 
a replacement for the debug SDK, to coerce a Linux build of Symtab to 
speak PE. You could probably pull the text section out via objdump or 
similar and stuff it into a fake ELF file. I think I also have a 
memory-backed CodeSource implementation floating around somewhere that 
you could use as a starting point--as long as you can find the text 
section and either don't care about symbols or can find them without 
Windows headers, mocking up a CodeSource that speaks PE on Linux is a 
simple matter of engineering. It's engineering we haven't done because 
parsing PE on Linux is not of much use to Dyninst without a *very* 
full-featured cross-format Symtab backing it, such that we could rewrite 
PE files on Linux.



Thanks,
Ed
___
Dyninst-api mailing list
Dyninst-api@cs.wisc.edu
https://lists.cs.wisc.edu/mailman/listinfo/dyninst-api




--
--bw

Bill Williams
Paradyn Project
b...@cs.wisc.edu


Re: [DynInst_API:] ParseAPI and PE files

2014-04-17 Thread E.Robbins
To be clear, I don't think this is a problem with the way we are using the 
ParseAPI, as I've written several programs that all work fine for ELF binaries, 
but not PE.

The failure is almost immediate when trying to create the SymtabCodeSource 
object:

SymtabCodeSource *sts = new SymtabCodeSource(binary_path);

Ed



Re: [DynInst_API:] ParseAPI and PE files

2014-04-17 Thread E.Robbins
Sorry, I just saw Bill's reply.



Re: [DynInst_API:] ParseAPI and PE files

2014-04-17 Thread E.Robbins
On 17/04/2014 16:40 PM Bill Williams wrote:
 Oh. One other thing--if you're trying to analyze PE files on Linux,
 that's not presently going to work. It might be possible, if you have a
 Linux system with the necessary Windows headers present and you know of
 a replacement for the debug SDK, to coerce a Linux build of Symtab to
 speak PE. 

Thanks. We are indeed trying to analyse PE files in Linux. I didn't realise 
that this wasn't supported. When you say the debug SDK, do you mean some kind 
of MS VS debugger?

 You could probably pull the text section out via objdump or
 similar and stuff it into a fake ELF file. 

We'll have to think about that, but it's certainly an option in the short 
term I guess. We are mostly looking at malware so symbols are mostly useless, 
but we probably will need to know about linkage, the entry point etc etc.

 I think I also have a
 memory-backed CodeSource implementation floating around somewhere that
 you could use as a starting point--as long as you can find the text
 section and either don't care about symbols or can find them without
 Windows headers, mocking up a CodeSource that speaks PE on Linux is a
 simple matter of engineering. 

What do you mean by a memory-backed CodeSource? We would be interested in 
anything that can help, though obviously we may decide it's too big a task.

 It's engineering we haven't done because
 parsing PE on Linux is not of much use to Dyninst without a *very*
 full-featured cross-format Symtab backing it, such that we could rewrite
 PE files on Linux.

Fair enough... we are somewhat at odds with the goals of Dyninst because we are 
doing static analysis and mostly use it for its control flow recovery, which is 
very good, and to some extent for reading symbols too.

The obvious answer then is to use Windows. Can the Windows version of Dyninst 
work over ELF binaries?

Thanks a lot,
Ed



Re: [DynInst_API:] ParseAPI and PE files

2014-04-17 Thread Bill Williams

On 04/17/2014 11:08 AM, E.Robbins wrote:

On 17/04/2014 16:40 PM Bill Williams wrote:

Oh. One other thing--if you're trying to analyze PE files on Linux,
that's not presently going to work. It might be possible, if you have a
Linux system with the necessary Windows headers present and you know of
a replacement for the debug SDK, to coerce a Linux build of Symtab to
speak PE.


Thanks. We are indeed trying to analyse PE files in Linux. I didn't realise 
that this wasn't supported. When you say the debug SDK, do you mean some kind 
of MS VS debugger?

No, there's a Debug Information Access (DIA) SDK that's available and 
does a fair bit of symbol parsing for PE files that we then bake into 
Symtab form. Its accessibility and redistributability have been somewhat 
variable IIRC but if you have a non-Express version of Visual Studio, 
you have full access to MS SDKs last I checked.



You could probably pull the text section out via objdump or
similar and stuff it into a fake ELF file.



We'll have to think about that, but it's certainly an option in the short term 
I guess. We are mostly looking at malware so symbols are mostly useless, but we 
probably will need to know about linkage, the entry point etc etc.


Yeah, if you're concerned with malware, you'll probably want a custom 
CodeSource whether you're working on Linux or Windows.



I think I also have a
memory-backed CodeSource implementation floating around somewhere that
you could use as a starting point--as long as you can find the text
section and either don't care about symbols or can find them without
Windows headers, mocking up a CodeSource that speaks PE on Linux is a
simple matter of engineering.


What do you mean by a memory-backed CodeSource? We would be interested in 
anything that can help, though obviously we may decide it's too big a task.

The two CodeSource implementations we distribute are Symtab- and SymLite- 
based; both of these work with files on disk, as one does. For internal 
testing purposes, it's convenient to be able to splat a blob of code 
into memory and parse it, and the CodeSource implementation to do that 
is what I mean by memory-backed.
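
To make the idea concrete, here is a minimal standalone C++ sketch of what "memory-backed" means: a byte buffer that pretends to be loaded at a chosen base address and answers address queries. The class and method names are hypothetical and it does not derive from the ParseAPI CodeSource class; a real implementation would have to implement that interface, which is not reproduced here.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a memory-backed code source: a blob of code
// bytes treated as if it were mapped at address `base`.
class MemoryCodeRegion {
public:
    MemoryCodeRegion(std::uint64_t base, std::vector<std::uint8_t> bytes)
        : base_(base), bytes_(std::move(bytes)) {}

    // True if `addr` falls inside the buffer.
    bool isValidAddress(std::uint64_t addr) const {
        return addr >= base_ && addr - base_ < bytes_.size();
    }

    // Pointer to the byte at `addr`, or nullptr if `addr` is out of range.
    const std::uint8_t* getPtr(std::uint64_t addr) const {
        return isValidAddress(addr) ? bytes_.data() + (addr - base_) : nullptr;
    }

    std::uint64_t offset() const { return base_; }        // start address
    std::size_t length() const { return bytes_.size(); }  // size in bytes

private:
    std::uint64_t base_;
    std::vector<std::uint8_t> bytes_;
};
```

A parser built on such a region only ever asks "is this address valid?" and "give me the bytes here", which is why no file on disk is needed.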


It occurs to me that what you actually may want is a Windows 
implementation of SymLite--something that knows how to mmap in sections, 
read section headers and the entry point, and optionally can give you a 
lightweight representation of each symbol. That would then plug into the 
existing SymLiteCodeSource and should work seamlessly.
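
As a rough sketch of the "read section headers and the entry point" part, the following standalone C++ reads those fields from a PE image held in memory, without any Windows headers. The field offsets follow the published PE/COFF layout; the struct and function names are hypothetical and are not the SymLite interface. Symbols, imports and the image base are left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct PeSection {
    std::string name;
    std::uint32_t virtualSize;
    std::uint32_t virtualAddress;  // RVA of the section
    std::uint32_t rawSize;
    std::uint32_t rawOffset;       // file offset of the section data
};

struct PeInfo {
    std::uint32_t entryPointRva = 0;
    std::vector<PeSection> sections;
};

using Bytes = std::vector<std::uint8_t>;

// Bounds-checked little-endian readers.
static bool rd16(const Bytes& b, std::size_t off, std::uint32_t& out) {
    if (off > b.size() || b.size() - off < 2) return false;
    out = b[off] | (static_cast<std::uint32_t>(b[off + 1]) << 8);
    return true;
}

static bool rd32(const Bytes& b, std::size_t off, std::uint32_t& out) {
    if (off > b.size() || b.size() - off < 4) return false;
    out = 0;
    for (int i = 3; i >= 0; --i) out = (out << 8) | b[off + i];
    return true;
}

// Fills `info` from a PE image; returns false if the headers look invalid.
bool parsePe(const Bytes& b, PeInfo& info) {
    info = PeInfo{};
    std::uint32_t peOff = 0, sig = 0, nSections = 0, optSize = 0;

    if (b.size() < 2 || b[0] != 'M' || b[1] != 'Z') return false;  // DOS stub
    if (!rd32(b, 0x3C, peOff)) return false;                       // e_lfanew
    if (!rd32(b, peOff, sig) || sig != 0x00004550) return false;   // "PE\0\0"

    const std::size_t coff = static_cast<std::size_t>(peOff) + 4;
    if (!rd16(b, coff + 2, nSections)) return false;   // NumberOfSections
    if (!rd16(b, coff + 16, optSize)) return false;    // SizeOfOptionalHeader

    const std::size_t opt = coff + 20;                 // optional header start
    if (optSize < 20) return false;
    if (!rd32(b, opt + 16, info.entryPointRva)) return false;  // AddressOfEntryPoint

    const std::size_t table = opt + optSize;           // section table start
    for (std::uint32_t i = 0; i < nSections; ++i) {
        const std::size_t s = table + 40u * i;         // 40 bytes per entry
        PeSection sec;
        if (!rd32(b, s + 8, sec.virtualSize) ||
            !rd32(b, s + 12, sec.virtualAddress) ||
            !rd32(b, s + 16, sec.rawSize) ||
            !rd32(b, s + 20, sec.rawOffset)) {
            return false;
        }
        // Section names are up to 8 bytes, NUL-padded.
        for (std::size_t k = 0; k < 8 && b[s + k] != 0; ++k) {
            sec.name += static_cast<char>(b[s + k]);
        }
        info.sections.push_back(sec);
    }
    return true;
}
```

The raw offset and size of the section named ".text" are what one would feed into a memory-backed region, with the entry point RVA as the first parse target. Malware may of course name or lay out its sections differently, so this is only a starting point.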



It's engineering we haven't done because
parsing PE on Linux is not of much use to Dyninst without a *very*
full-featured cross-format Symtab backing it, such that we could rewrite
PE files on Linux.


Fair enough... we are somewhat at odds with the goals of dyninst because we are 
doing static analysis and mostly use it for its control flow recovery which is 
very good, and to some extent for reading symbols too.

The obvious answer then is to use windows. Can the windows version of dyninst 
work over ELF binaries?

The Windows version doesn't speak ELF, though I think adding that is more 
practical than getting Linux Dyninst to speak PE fully. I haven't 
checked recently whether libelf/libdwarf build cleanly on Windows, but 
if they do then that's a pretty straightforward project; we'd build with 
libelf, libdwarf, and the dynElf/dynDwarf wrapper libraries, add the 
various ELF-reading source files to the build, and toss a mechanism into 
Symtab::openFile to check the file type and open things on the right 
path. (Note, straightforward does not mean low-effort; we'd almost 
certainly have to redesign some of the class structure so that the file 
type could be a runtime decision rather than a build-time one. But it's 
doable in principle.)
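
The file-type check itself is the easy part and can be sketched in standalone C++ by looking at the leading magic bytes. This is only an illustration of a runtime decision; the enum and function names are hypothetical, and this is not how Symtab::openFile is implemented. Note that "MZ" only identifies a DOS/PE container, so a full check would also verify the PE signature.

```cpp
#include <cstddef>
#include <cstdint>

enum class BinaryFormat { Elf, Pe, Unknown };

// Guesses the container format from the first bytes of a file.
// (Hypothetical helper, not part of Symtab.)
BinaryFormat detectFormat(const std::uint8_t* data, std::size_t size) {
    // ELF files start with 0x7f 'E' 'L' 'F'.
    if (size >= 4 && data[0] == 0x7f && data[1] == 'E' &&
        data[2] == 'L' && data[3] == 'F') {
        return BinaryFormat::Elf;
    }
    // PE files start with the DOS stub signature "MZ".
    if (size >= 2 && data[0] == 'M' && data[1] == 'Z') {
        return BinaryFormat::Pe;
    }
    return BinaryFormat::Unknown;
}
```

The result would then select which reader to construct, which is where the class redesign mentioned above comes in.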


I think your best bet would be a cross-platform PE implementation of 
SymLite, though.



Thanks a lot,
Ed




--
--bw

Bill Williams
Paradyn Project
b...@cs.wisc.edu