On Fri, Jun 27, 2014 at 1:57 AM, Nicolas B. Pierron < [email protected]> wrote:
> Yes, the idea I have in mind is to have some-kind of self-hosted
> compartment dedicated to analysis where if a function named "xyz" is
> declared on the global, then it can be used preferably asynchronously (as
> we might not want to pay the cross-compartment call), or synchronously
> (waiting for the day we inline cross-compartment calls in ICs / code), or
> maybe both.
>
>> In terms of hooks, an API enabling arbitrary program transformation has big
>> advantages as a basis for implementing dynamic analyses, compared to other
>> kinds of API:
>> 1) Maximum flexibility for tool builders. You can do essentially anything
>> you want with the program execution.
>> 2) Simple interface to the underlying VM. So it's easy to maintain as the
>> VM evolves. And, importantly, minimal work for Mozilla.
>
> Except if Mozilla is maintaining these tools as we want to rely on these.
> For example the Security team wants to rely on some taint analysis or even
> other simple analysis for checking if events have been validated before
> being processed.

Yes, but we can collaborate on that with a large group of people --- at
least, a larger group than the set of people who want to hack on
Spidermonkey.

>> 3) Potentially very low overhead, because instrumentation code can be
>> inlined into application code by the JIT.
>
> I have a question for you, and also for people who have made such analysis
> in SpiderMonkey. Why taking all the pain of integrating such analysis in
> SpiderMonkey's code, which is hard and change frequently when it would be
> easy (based on what you mention) to just do source-to-source transformation?
>
> Why do we have 3 propositions of implementing taint analysis in
> SpiderMonkey so far? It sounds to me that there is something which is not
> easily accessible from source-to-source transformation, which might be
> easier to get hooked once you are deep inside the engine.

I don't know.
One reasonable guess would be that if you're doing a research project and
you want to minimize overhead and don't care about maintainability, you
can't go wrong by modifying the engine directly.

>> You identified some disadvantages:
>> 1) It may be difficult to keep the language support of code transformation
>> tools in sync with Spidermonkey.
>> 2) Code transformation tools may introduce application bugs (e.g. by
>> polluting the global or due to a bug in translation).
>> 3) Transformed code may incur unacceptable slowdown (e.g. due to ubiquitous
>> boxing).
>> (Did I miss anything?)
>
> Source-to-source implies that analysis developers have to know about the
> JS implementation, and JS syntax. While such work belongs to the
> JavaScript engine developers.

Analysis developers would be exposed to fewer JS implementation details by
working at the source level than by working at the bytecode or some more
Spidermonkey-internal level. Yes, they would have to have detailed knowledge
of JS syntax and semantics, but that's OK; people developing program
analysis frameworks expect to have to know those things :-).

And not every analysis developer will have to know everything. A good
framework will present higher-level abstractions that make it easy to write
simple analyses (while still making it possible to write deep ones). I trust
Manu and his friends to write a good framework :-).

>> I think #2 really only matters for people who want to deploy dynamic
>> analysis in customer-facing production systems, and I don't think that will
>> be important anytime soon.
>
> On the contrary, I think/hope we could have trivial taint analysis to
> monitor privacy, in a similar way as Lightbeam (Collusion) is doing.

I hesitate to use "trivial" and "taint analysis" in the same sentence, but
OK. I still think we can leave this up to the developers of the analysis
framework. They are just as smart as us, trust me :-).
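
To make the source-level approach and the "ubiquitous boxing" cost (disadvantage 3 above) a bit more concrete, here is a rough sketch of what transformed code for a toy taint analysis might look like. This is an illustration only: every name in it (Tainted, box, taintSource, taintConcat, taintSink) is hypothetical and does not come from any existing framework, and a real tool would generate the rewritten calls automatically from the original source.

```typescript
// Hypothetical sketch: all names are invented for illustration.

// A boxed string: the original value plus a taint flag.
// Wrapping values like this is the "ubiquitous boxing" overhead.
class Tainted {
  readonly value: string;
  readonly tainted: boolean;
  constructor(value: string, tainted: boolean) {
    this.value = value;
    this.tainted = tainted;
  }
}

// Box a plain string as untainted; pass boxed values through unchanged.
function box(v: string | Tainted): Tainted {
  return typeof v === "string" ? new Tainted(v, false) : v;
}

// Mark a value as coming from an untrusted source.
function taintSource(v: string): Tainted {
  return new Tainted(v, true);
}

// Instrumented stand-in for `a + b` on strings:
// the result is tainted if either operand is tainted.
function taintConcat(a: string | Tainted, b: string | Tainted): Tainted {
  const x = box(a);
  const y = box(b);
  return new Tainted(x.value + y.value, x.tainted || y.tainted);
}

// A sink rejects tainted data and unwraps clean data.
function taintSink(v: string | Tainted): string {
  const x = box(v);
  if (x.tainted) {
    throw new Error("tainted value reached sink");
  }
  return x.value;
}

// Original program (conceptually):
//   const url = "https://example.org/?q=" + userInput;
// After the hypothetical transformation:
const userInput = taintSource("secret");
const url = taintConcat("https://example.org/?q=", userInput);
```

Note that the helper functions here live at top level, which is exactly the kind of global pollution mentioned in disadvantage 2; a careful transformation tool would have to hide them from application code.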
> The asynchronism is one suggestion to make recording analysis faster, by
> avoiding frequent cross-compartment calls. I do not see any issue to have
> synchronous request, on the contrary I think it might be interesting to
> interrupt the program execution on such request, or even change the program
> execution (things that we can only do synchronously) to prevent security
> holes / privacy leaks.

OK but fast synchronous calls to instrumentation code will very quickly
become important. It's not clear to me why we can't have instrumentation
code running in the same compartment.

I echo what Shu said. Standardizing a code format lower-level than JS syntax
seems like a big maintenance burden for Spidermonkey. Better to have a
separate front end maintained outside Spidermonkey. These formats have
different requirements so it makes sense to allow them to evolve
independently. In practice, I don't think keeping the extra front end up to
date will be a problem. People are already doing this, e.g. Traceur.

Rob
--
Jtehsauts tshaei dS,o n" Wohfy Mdaon yhoaus eanuttehrotraiitny eovni le atrhtohu gthot sf oirng iyvoeu rs ihnesa.r"t sS?o Whhei csha iids teoa stiheer :p atroa lsyazye,d 'mYaonu,r "sGients uapr,e tfaokreg iyvoeunr, 'm aotr atnod sgaoy ,h o'mGee.t" uTph eann dt hwea lmka'n? gBoutt uIp waanndt wyeonut thoo mken.o w
_______________________________________________
dev-tech-js-engine-internals mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-tech-js-engine-internals

