Re: [JS-internals] Dynamic Analysis API discussion

Robert O'Callahan Thu, 26 Jun 2014 16:02:29 -0700

On Fri, Jun 27, 2014 at 1:57 AM, Nicolas B. Pierron <
[email protected]> wrote:


> Yes, the idea I have in mind is to have some-kind of self-hosted
> compartment dedicated to analysis where if a function named "xyz" is
> declared on the global, then it can be used preferably asynchronously (as
> we might not want to pay the cross-compartment call), or synchronously
> (waiting for the day we inline cross-compartment calls in ICs / code), or
> maybe both.
>
>> In terms of hooks, an API enabling arbitrary program transformation has
>> big advantages as a basis for implementing dynamic analyses, compared to
>> other kinds of API:
>> 1) Maximum flexibility for tool builders. You can do essentially anything
>> you want with the program execution.
>> 2) Simple interface to the underlying VM. So it's easy to maintain as the
>> VM evolves. And, importantly, minimal work for Mozilla.
>>
>
> Except if Mozilla is maintaining these tools as we want to rely on these.
> For example the Security team wants to rely on some taint analysis or even
> other simple analysis for checking if events have been validated before
> being processed.
>

Yes, but we can collaborate on that with a large group of people --- at
least, a larger group than the set of people who want to hack on
SpiderMonkey.


>
>> 3) Potentially very low overhead, because instrumentation code can be
>> inlined into application code by the JIT.
>>
>
> I have a question for you, and also for people who have made such analysis
> in SpiderMonkey.  Why taking all the pain of integrating such analysis in
> SpiderMonkey's code, which is hard and change frequently when it would be
> easy (based on what you mention) to just do source-to-source transformation?
>
> Why do we have 3 propositions of implementing taint analysis in
> SpiderMonkey so far?  It sounds to me that there is something which is not
> easily accessible from source-to-source transformation, which might be
> easier to get hooked once you are deep inside the engine.
>

I don't know. One reasonable guess would be that if you're doing a research
project and you want to minimize overhead and don't care about
maintainability, you can't go wrong by modifying the engine directly.


>> You identified some disadvantages:
>> 1) It may be difficult to keep the language support of code transformation
>> tools in sync with SpiderMonkey.
>> 2) Code transformation tools may introduce application bugs (e.g. by
>> polluting the global or due to a bug in translation).
>> 3) Transformed code may incur unacceptable slowdown (e.g. due to
>> ubiquitous boxing).
>> (Did I miss anything?)
>>
>
> Source-to-source implies that analysis developers have to know about the
> JS implementation, and JS syntax.  While such work belongs to the
> JavaScript engine developers.
>

Analysis developers would be exposed to fewer JS implementation details by
working at the source level than by working at the bytecode or some more
SpiderMonkey-internal level. Yes, they would have to have detailed
knowledge of JS syntax and semantics, but that's OK; people developing
program analysis frameworks expect to have to know those things :-).

And not every analysis developer will have to know everything. A good
framework will present higher-level abstractions that make it easy to write
simple analyses (while still making it possible to write deep ones). I trust
Manu and his friends to write a good framework :-).
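
To make "higher-level abstractions" concrete, here is a minimal sketch of
what such a framework layer might look like. It is not any existing
framework's API: makeRuntime, dispatch, and the hook name "binary" are all
hypothetical, and the instrumented expression at the end is written by hand
where a real tool would generate it by source-to-source transformation. The
point is only that an analysis author fills in the hooks they care about and
never touches the transformation itself.

```javascript
// Hypothetical framework layer: every name here is illustrative only.
function makeRuntime(analysis) {
  // Call the hook if the analysis defines it; otherwise use the fallback.
  function dispatch(hook, args, fallback) {
    return typeof analysis[hook] === "function"
      ? analysis[hook](...args)
      : fallback;
  }
  return {
    // Instrumented code would call binary("+", a, b) in place of a + b.
    binary(op, left, right) {
      let result;
      switch (op) {
        case "+": result = left + right; break;
        case "-": result = left - right; break;
        case "*": result = left * right; break;
        default: throw new Error("unsupported operator: " + op);
      }
      // The hook may observe the result or return a replacement for it.
      return dispatch("binary", [op, left, right, result], result);
    },
  };
}

// A simple analysis: count string concatenations, leave results unchanged.
const concatCounter = {
  count: 0,
  binary(op, left, right, result) {
    if (op === "+" && typeof result === "string") this.count++;
    return result;
  },
};

const rt = makeRuntime(concatCounter);
// Hand-instrumented form of: ("a" + "b") + (1 + 2)
const value = rt.binary("+", rt.binary("+", "a", "b"), rt.binary("+", 1, 2));
```

An analysis that defines no hooks at all still runs the program unchanged,
which is roughly the property one would want from the simple end of the
framework.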


>> I think #2 really only matters for people who want to deploy dynamic
>> analysis in customer-facing production systems, and I don't think that
>> will be important anytime soon.
>>
>
> On the contrary, I think/hope we could have trivial taint analysis to
> monitor privacy, in a similar way as Lightbeam (Collusion) is doing.
>

I hesitate to use "trivial" and "taint analysis" in the same sentence, but
OK. I still think we can leave this up to the developers of the analysis
framework. They are just as smart as us, trust me :-).

> The asynchronism is one suggestion to make recording analysis faster, by
> avoiding frequent cross-compartment calls.  I do not see any issue to have
> synchronous request, on the contrary I think it might be interesting to
> interrupt the program execution on such request, or even change the program
> execution (things that we can only do synchronously) to prevent security
> holes / privacy leaks.
>

OK, but fast synchronous calls to instrumentation code will very quickly
become important. It's not clear to me why we can't have instrumentation
code running in the same compartment.
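
As a rough sketch of what I mean by same-compartment instrumentation, assume
the transformation rewrites a call site so that it first passes through a
hook defined in the same global. The names beforeCall, blockedHosts, send,
and instrumentedSend are made up for illustration, and the host names are
placeholders. Since the hook is an ordinary synchronous function call, it
can stop or alter the operation before it happens, and nothing crosses a
compartment boundary.

```javascript
// Hypothetical hook living in the same global as the application code.
const blockedHosts = new Set(["tracker.example"]);

function beforeCall(name, args) {
  // Synchronous check: throwing here interrupts the program before the
  // underlying operation runs.
  if (name === "send" && blockedHosts.has(args[0])) {
    throw new Error("blocked send to " + args[0]);
  }
  // The hook could also return modified arguments; here they pass through.
  return args;
}

// Stand-in for an application function that leaks data somewhere.
const sent = [];
function send(host, payload) {
  sent.push(host + ":" + payload);
}

// Instrumented form of a call site `send(host, payload)`.
function instrumentedSend(host, payload) {
  return send(...beforeCall("send", [host, payload]));
}
```

Calling instrumentedSend with an allowed host records the message, while a
blocked host throws before send is ever reached.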

I echo what Shu said. Standardizing a code format lower-level than JS
syntax seems like a big maintenance burden for SpiderMonkey. Better to have
a separate front end maintained outside SpiderMonkey. These formats have
different requirements so it makes sense to allow them to evolve
independently. In practice, I don't think keeping the extra front end up to
date will be a problem. People are already doing this, e.g. Traceur.

Rob
-- 
Jtehsauts  tshaei dS,o n" Wohfy  Mdaon  yhoaus  eanuttehrotraiitny  eovni
le atrhtohu gthot sf oirng iyvoeu rs ihnesa.r"t sS?o  Whhei csha iids  teoa
stiheer :p atroa lsyazye,d  'mYaonu,r  "sGients  uapr,e  tfaokreg iyvoeunr,
'm aotr  atnod  sgaoy ,h o'mGee.t"  uTph eann dt hwea lmka'n?  gBoutt  uIp
waanndt  wyeonut  thoo mken.o w
_______________________________________________
dev-tech-js-engine-internals mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
