Re: [JS-internals] Dynamic Analysis API discussion

Robert O'Callahan Thu, 26 Jun 2014 04:50:46 -0700

Your email is unclear as to whether you're proposing integrating some
particular analysis engine or framework into SpiderMonkey (or more than
one), or just some minimal set of hooks to enable others to supply such
engines/frameworks. I'm going to assume the latter since I think the former
makes no sense at all.

In terms of hooks, an API enabling arbitrary program transformation has big
advantages as a basis for implementing dynamic analyses, compared to other
kinds of API:
1) Maximum flexibility for tool builders. You can do essentially anything
you want with the program execution.
2) Simple interface to the underlying VM, so it's easy to maintain as the
VM evolves and, importantly, requires minimal work from Mozilla.
3) Potentially very low overhead, because instrumentation code can be
inlined into application code by the JIT.
I spent a few years writing dynamic analysis tools for Java, and they all
used bytecode transformation for all these reasons.
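
To make the shape of such a hook concrete, here is a minimal sketch. Everything
in it is an assumption made for illustration: `instrumentSource`, `loadScript`
and `__enter` are hypothetical names, not an existing SpiderMonkey API, and the
regular expression is only a toy stand-in for a real parser-based rewriter. The
point is that the VM side only has to pass source through a tool-supplied
function before compiling it; the tool decides what to inject.

```javascript
// Toy rewriter: inserts a call to __enter("name") at the top of every named
// function declaration. A real tool would rewrite an AST, not use a regex.
function instrumentSource(source) {
  return source.replace(
    /function\s+([A-Za-z_$][\w$]*)\s*\([^)]*\)\s*\{/g,
    (match, name) => `${match} __enter(${JSON.stringify(name)});`
  );
}

// Hypothetical loader playing the role of the VM hook: every script passes
// through `transform` before it is compiled. The analysis callback is handed
// to the compiled code as a parameter, so nothing is added to the global.
function loadScript(source, transform, analysis) {
  const compiled = new Function("__enter", transform(source));
  return compiled(analysis.enter);
}

// Example analysis: count how often each function is entered.
const counts = new Map();
const analysis = {
  enter(name) {
    counts.set(name, (counts.get(name) || 0) + 1);
  },
};

const app = `
  function fib(n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }
  return fib(5);
`;

const result = loadScript(app, instrumentSource, analysis);
console.log(result, counts.get("fib"));
```

Because the injected call is ordinary JS in the application's own code, the JIT
is free to inline it like any other call, which is where the low-overhead
argument in point 3 comes from.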

You identified some disadvantages:
1) It may be difficult to keep the language support of code transformation
tools in sync with SpiderMonkey.
2) Code transformation tools may introduce application bugs (e.g. by
polluting the global object or due to a bug in translation).
3) Transformed code may incur unacceptable slowdown (e.g. due to ubiquitous
boxing).
(Did I miss anything?)

I think #2 really only matters for people who want to deploy dynamic
analysis in customer-facing production systems, and I don't think that will
be important anytime soon.

#1 doesn't seem like a big problem to me. Extending a JS parser is not that
hard. New language features with complex semantics require significant tool
updates whatever API we use. If we're using these tools ourselves, we'd
have to update the tools sometime between landing the feature in
SpiderMonkey and starting to use it in Firefox OS or anywhere else we
depend on analysis.

#3 is interesting and perhaps where lessons learned from Java and other
contexts do not apply. I think we should dig into specific tool examples
for this; maybe some combination of more intelligent translation and
judicious API extensions can solve the problems.

Nicolas B. Pierron wrote:

> Personally, I think that these issues implies that we should avoid relying
> on a source-to-source mapping if we want to provide meaningful security
> results. We could replicate the same or a similar API in SpiderMonkey, and
> even make one compatible with Jalangi analysis.
>
>
It's not clear what you mean by "the same or a similar API" here.

> If we add opcodes dedicated to monitor values (at the bytecode emitter
> level), instead of doing source-to-source transformation. One of the
> advantage would be that frontend developers would not have to maintain
> Jalangi sources when we are adding new features in SpiderMonkey, and more
> over, the bytecode emitter already breakdown everything to opcodes, which
> are easier to wrap than the source.
>
> Analysis are usually made to observe the execution of a code, and not to
> mutate it.  So if we only monitor the execution, instead of emulating it, we
> might be able to batch analysis calls.  Doing batches asynchronously implies
> that the overhead of running an analysis is  minimal while the analyzed code
> is running.
>
>
Logging and log analysis have their place, but a lot of dynamic analysis
tools rely on efficient synchronous online data processing in
instrumentation code. For example, if you want to count the number of times
a program point is reached, it's much more efficient to increment a global
variable at that program point than to log to a buffer every time that
point is reached, and count log entries offline. For many analyses of
real-world applications, high-volume data logging is neither efficient nor
scalable. Here are a couple of examples of Java tools I worked on where
synchronous online data processing was essential:
-- http://fsl.cs.illinois.edu/images/e/e8/P385-goldsmith.pdf
-- http://web5.cs.columbia.edu/~junfeng/09fa-e6998/papers/hybrid.pdf
So I think injection of synchronously executed instrumentation is essential
for a large class of analyses.
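
The counting example can be sketched as follows. This is only an illustration
under assumed names (`onHitCounter`, `onHitLog`, `countOffline` and the program
point id 7 are all hypothetical); it makes no timing claims and just shows that
the two approaches give the same answer while the logging variant keeps one
entry per hit.

```javascript
// Synchronous online processing: the injected code is a single increment,
// and the analysis state stays constant-size however often the point is hit.
let hitCount = 0;
function onHitCounter(pointId) {
  hitCount++;
}

// Log-then-analyze: the injected code appends an entry per hit, and the
// count is only recovered later by scanning the whole log.
const log = [];
function onHitLog(pointId) {
  log.push(pointId);
}
function countOffline(entries, pointId) {
  let n = 0;
  for (const id of entries) {
    if (id === pointId) n++;
  }
  return n;
}

// Stand-in for the application: reaches program point 7 `iterations` times.
function run(iterations, hook) {
  for (let i = 0; i < iterations; i++) {
    hook(7);
  }
}

run(100000, onHitCounter);
run(100000, onHitLog);

console.log(hitCount);             // one number
console.log(log.length);           // one log entry per hit
console.log(countOffline(log, 7)); // same count, after an extra pass
```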

Rob
-- 
Jtehsauts  tshaei dS,o n" Wohfy  Mdaon  yhoaus  eanuttehrotraiitny  eovni
le atrhtohu gthot sf oirng iyvoeu rs ihnesa.r"t sS?o  Whhei csha iids  teoa
stiheer :p atroa lsyazye,d  'mYaonu,r  "sGients  uapr,e  tfaokreg iyvoeunr,
'm aotr  atnod  sgaoy ,h o'mGee.t"  uTph eann dt hwea lmka'n?  gBoutt  uIp
waanndt  wyeonut  thoo mken.o w
