Re: [JS-internals] Dynamic Analysis API discussion

Manu Sridharan Fri, 27 Jun 2014 10:51:14 -0700

Some additional points that hopefully are not entirely redundant with what 
others have already said:


* There is a growing ecosystem of JavaScript parsing and instrumentation 
toolkits, beyond Jalangi, e.g.:

https://github.com/wala/JS_WALA
https://github.com/substack/node-falafel

The nice thing about supporting a source-to-source API is that it will 
encourage experimentation with yet more approaches, which might lead to new 
insights into doing low-overhead instrumentation, what would be appropriate for 
a lower-level API, etc.
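
To make the source-to-source idea concrete, here is a minimal sketch of what an instrumented program might look like. The `hooks` object and the hook names `binary` and `putField` are hypothetical, chosen for illustration; they are not the API of Jalangi or of any toolkit listed above, and the "transformed" function is written by hand rather than produced by a real rewriter.

```javascript
// Hypothetical analysis runtime: these hook names are illustrative only,
// not the API of Jalangi or any other toolkit.
const log = [];
const hooks = {
  // Called in place of a binary operator; records it, then performs it.
  binary(op, left, right) {
    log.push(`binary ${op}`);
    switch (op) {
      case "+": return left + right;
      case "*": return left * right;
      default: throw new Error(`unsupported operator: ${op}`);
    }
  },
  // Called in place of a property write; records it, then performs it.
  putField(base, name, value) {
    log.push(`putField ${name}`);
    base[name] = value;
    return value;
  },
};

// Original program text:
//   function area(rect) { rect.area = rect.w * rect.h; return rect.area; }
// A source-to-source pass might emit something like this instead:
function areaInstrumented(rect) {
  hooks.putField(rect, "area", hooks.binary("*", rect.w, rect.h));
  return rect.area;
}

const result = areaInstrumented({ w: 3, h: 4 });
console.log(result, log);
```

The engine only ever sees ordinary JavaScript here, which is why the interface between the analysis and the VM can stay small.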

* A source-to-source API enables greater portability, e.g., for writing 
analyses / transformations that work for node.js programs and (at least 
partially) in other browsers.  Even if some analyses rely on FF-specific 
features, much of the logic would probably be shareable across runtimes.

* As Koushik mentioned on another thread, Michael Pradel has already created a 
modified version of Firefox that supports source-to-source (S2S) instrumentation:

https://github.com/Berkeley-Correctness-Group/Jalangi-Berkeley

Given that an outside developer could do this without deep expertise in the 
Firefox JS engine, I imagine the maintenance burden of an S2S API would be 
fairly low, making it worth doing even in addition to a lower-level API.

* While the overhead of Jalangi instrumentation is high, this is not 
fundamental to source-to-source instrumentation.  Instrumentation customized 
to a particular client could have much lower overhead.
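
As a rough illustration of client-specific instrumentation, the sketch below assumes a client that only cares about property writes. The `recordWrite` hook and the `sumInto` example are hypothetical names made up for this sketch. Only the write is routed through a hook; the loop and the arithmetic stay as plain JavaScript that the JIT can optimize as usual.

```javascript
// Hypothetical client-specific hook: only property writes are observed.
const writeCounts = new Map();
function recordWrite(base, name, value) {
  writeCounts.set(name, (writeCounts.get(name) || 0) + 1);
  base[name] = value;
  return value;
}

// Original program text:
//   function sumInto(counter, n) {
//     for (let i = 1; i <= n; i++) counter.total = counter.total + i;
//     return counter.total;
//   }
// Customized instrumentation wraps the write and nothing else:
function sumInto(counter, n) {
  for (let i = 1; i <= n; i++) {
    // The addition is left uninstrumented; only the store goes through a hook.
    recordWrite(counter, "total", counter.total + i);
  }
  return counter.total;
}

const total = sumInto({ total: 0 }, 4);
console.log(total, writeCounts.get("total"));
```

Whether this actually yields low overhead in practice would depend on the engine inlining the hook, which this sketch does not measure.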

* Regarding a lower-level API, one motivating client might be Event Racer:

http://eventracer.org/

Event Racer cannot be built using JS instrumentation alone, as it requires 
detailed information about DOM and event-loop operations.  Right now, it's 
built upon a modified WebKit (with work on porting to Blink in progress).  We 
had a very preliminary discussion with Servo developers about designing an API 
upon which Event Racer could be built, but we didn't pursue it further.  If you 
think supporting such a client analysis might be desirable, I can ask the Event 
Racer developers to chime in with more feedback.

Best,
Manu


On Thursday, June 26, 2014 4:01:38 PM UTC-7, Robert O'Callahan wrote:
> On Fri, Jun 27, 2014 at 1:57 AM, Nicolas B. Pierron <[email protected]> wrote:
>
> > Yes, the idea I have in mind is to have some-kind of self-hosted
> > compartment dedicated to analysis where if a function named "xyz" is
> > declared on the global, then it can be used preferably asynchronously (as
> > we might not want to pay the cross-compartment call), or synchronously
> > (waiting for the day we inline cross-compartment calls in ICs / code), or
> > maybe both.
> >
> >> In terms of hooks, an API enabling arbitrary program transformation has big
> >> advantages as a basis for implementing dynamic analyses, compared to other
> >> kinds of API:
> >> 1) Maximum flexibility for tool builders. You can do essentially anything
> >> you want with the program execution.
> >> 2) Simple interface to the underlying VM. So it's easy to maintain as the
> >> VM evolves. And, importantly, minimal work for Mozilla.
> >
> > Except if Mozilla is maintaining these tools as we want to rely on these.
> > For example the Security team wants to rely on some taint analysis or even
> > other simple analysis for checking if events have been validated before
> > being processed.
>
> Yes, but we can collaborate on that with a large group of people --- at
> least, a larger group than the set of people who want to hack on
> Spidermonkey.
>
> >> 3) Potentially very low overhead, because instrumentation code can be
> >> inlined into application code by the JIT.
> >
> > I have a question for you, and also for people who have made such analysis
> > in SpiderMonkey.  Why taking all the pain of integrating such analysis in
> > SpiderMonkey's code, which is hard and change frequently when it would be
> > easy (based on what you mention) to just do source-to-source transformation?
> >
> > Why do we have 3 propositions of implementing taint analysis in
> > SpiderMonkey so far?  It sounds to me that there is something which is not
> > easily accessible from source-to-source transformation, which might be
> > easier to get hooked once you are deep inside the engine.
>
> I don't know. One reasonable guess would be that if you're doing a research
> project and you want to minimize overhead and don't care about
> maintainability, you can't go wrong by modifying the engine directly.
>
> >> You identified some disadvantages:
> >> 1) It may be difficult to keep the language support of code transformation
> >> tools in sync with Spidermonkey.
> >> 2) Code transformation tools may introduce application bugs (e.g. by
> >> polluting the global or due to a bug in translation).
> >> 3) Transformed code may incur unacceptable slowdown (e.g. due to ubiquitous
> >> boxing).
> >> (Did I miss anything?)
> >
> > Source-to-source implies that analysis developers have to know about the
> > JS implementation, and JS syntax.  While such work belongs to the
> > JavaScript engine developers.
>
> Analysis developers would be exposed to less JS implementation details by
> working at the source level than by working at the bytecode or some more
> Spidermonkey-internal level. Yes, they would have to have detailed
> knowledge of JS syntax and semantics, but that's OK; people developing
> program analysis frameworks expect to have to know those things :-).
>
> And not every analysis developer will have to know everything. A good
> framework will present higher-level abstractions that make it easy to write
> simple analyses (while still being possible to write deep ones). I trust
> Manu and his friends to write a good framework :-).
>
> >> I think #2 really only matters for people who want to deploy dynamic
> >> analysis in customer-facing production systems, and I don't think that will
> >> be important anytime soon.
> >
> > On the contrary, I think/hope we could have trivial taint analysis to
> > monitor privacy, in a similar way as Lightbeam (Collusion) is doing.
>
> I hesitate to use "trivial" and "taint analysis" in the same sentence, but
> OK. I still think we can leave this up to the developers of the analysis
> framework. They are just as smart as us, trust me :-).
>
> > The asynchronism is one suggestion to make recording analysis faster, by
> > avoiding frequent cross-compartment calls.  I do not see any issue to have
> > synchronous request, on the contrary I think it might be interesting to
> > interrupt the program execution on such request, or even change the program
> > execution (things that we can only do synchronously) to prevent security
> > holes / privacy leaks.
>
> OK but fast synchronous calls to instrumentation code will very quickly
> become important. It's not clear to me why we can't have instrumentation
> code running in the same compartment.
>
> I echo what Shu said. Standardizing a code format lower-level than JS
> syntax seems like a big maintenance burden for Spidermonkey. Better to have
> a separate front end maintained outside Spidermonkey. These formats have
> different requirements so it makes sense to allow them to evolve
> independently. In practice, I don't think keeping the extra front end up to
> date will be a problem. People are already doing this, e.g. Traceur.
>
> Rob
> 
> -- 
> 
> Jtehsauts  tshaei dS,o n" Wohfy  Mdaon  yhoaus  eanuttehrotraiitny  eovni
> 
> le atrhtohu gthot sf oirng iyvoeu rs ihnesa.r"t sS?o  Whhei csha iids  teoa
> 
> stiheer :p atroa lsyazye,d  'mYaonu,r  "sGients  uapr,e  tfaokreg iyvoeunr,
> 
> 'm aotr  atnod  sgaoy ,h o'mGee.t"  uTph eann dt hwea lmka'n?  gBoutt  uIp
> 
> waanndt  wyeonut  thoo mken.o w

_______________________________________________
dev-tech-js-engine-internals mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals
