Excellent! Thanks, Matthew and Ryan for figuring out what appears to be a promising approach to a gaping hole in our security picture.
Robby On Thu, Jun 30, 2011 at 9:19 AM, Matthew Flatt <[email protected]> wrote: > I've pushed a change to Racket's macro system that throws out the > syntax-certificate system and adds a syntax-taint system. > > > Syntax taints, like syntax certificates before, are intended to > protect macro expansions from abuse. "Abuse" means using > `local-expand' to extract a piece of an expansion, then putting the > piece in a new context or using `datum->syntax' to access unexported > module bindings using the piece's lexical context. Meanwhile, > program-processing tools like `errortrace' or languages like > `typed/racket' are supposed to pull apart expansions and reorganize > them; code inspectors remain the way to distinguish trusted tools and > languages from potential abusers. > > Things you need to know: > > * When writing a macro (without `syntax-rules', `define-syntax-rule', > or `syntax-id-rules'), apply `syntax-protect' to the macro's result > syntax object to protect it from abuse. The `syntax-rules', etc., > forms use `syntax-protect' automatically. > > * If you write program-processing tools or languages that use > `syntax-recertify', you'll need to change them to use > `syntax-disarm' and possibly `syntax-rearm'. > > > More Background > --------------- > > Combining extensibility with security is a difficult problem in most > any domain. Macros are an important tool for language extensibility, > and `local-expand' enables an important class of language extensions > in Racket, but `local-expand' also opens the door to macro bindings > that break abstractions --- reaching into a macro expansion to access > bindings that are intended to be private. > > Syntax certificates were designed to protect private bindings in the > presence of `local-expand'. The idea was that a macro expansion can be > certified as originating from a particular module, and the use of > unexported identifiers would be allowed based on that > certification. Pulling apart a syntax object loses certificates, so > unexported bindings cannot be extracted and used in contexts which the > defining module would not approve. > > Certificates didn't work well enough: > > * The distinction between unexported bindings and other bindings > meant that certificate-maintenance problems were often detected > late, since detecting problems required an example that spans > module boundaries and involves unexported identifier. > > * Since certificates are module-specific, identifiers introduced in a > macro expansion must be distinguished from those present in a macro > use; certificates thus get tangled up in the macro system's hygiene > machinery. Macro-generating macros create particular complexities > when the generator macro lives in one module and the generated > macro lives in another module. The system of "lifting" certificates > and "inactive" versus "active" certificates, which evolved to > address this problem, was difficult to reason about. In the case of > recent reported problems with certificates, it's still not clear > whether the certificate system was buggy or misused. > > * We found two holes in the certificate system: > > - Grabbing a macro transformer by `syntax-local-value' and > applying it directly could provide access to identifiers with > "inactive" certificates. The inactive certificates could then be > lifted to any use context of the macro, defeating the intended > protection of certificates. > > - Using `local-expand' on method names within a class could expose > locally bound variables that were intended as private (i.e., > they had no protection because they were not module-bound > variables). > > Ryan and I have been looking for a better solution for a long time. I > think Ryan has solved the problem with the design of the taint > system. Proving that syntax taints provide security as intended > remains a steep challenge, but the design closes the known hole with > certificates, and our intuition is that they should work better in > practice: > > * Taints apply uniformly to all identifiers, including local > identifiers, identifiers in the same module, exported identifiers, > and unexported identifiers. Taint-management problems in syntax > tools should therefore become apparent more readily. > > * Taints are not module-specific, which detangles them from hygiene > machinery and avoids many other design and implementation problems. > > One indicator of this improvement is that the total size of ".zo" > files in the Racket installation is 20% smaller with taints in > place of certificates. (DrRacket's footprint is about 5% smaller, > and start-up time for `racket' is slightly shorter. Interestingly, > `raco setup' time is not significantly affected --- perhaps because > certificate tracking had been such a bottleneck in the past that I > spent a lot of time making it fast.) The "syntax.c" file is 1000 > lines shorter. > > The trigger for a taint (a "dye pack" --- see below) is > inspector-specific, so macro-generating macros still require some > care in case the generator and the generated macro are under > different code inspectors; this extra work is still easier to > reason about than inactive certificates. > > * The two known holes are fixed: > > - Using `syntax-local-value' to access a transformer directly no > longer circumvents the system, because the transformer itself > arms a syntax object, rather than making protection the > responsibility of the macro expander that invokes the > transformer. To some degree, this change also allows a macro > implementor to opt out of the taint system. > > A drawback of the change is that a macro implementor must > explicitly opt in to protection of its enclosing module --- at > least when not using `syntax-rules', `define-syntax-rule', or > `syntax-id-rules', all of which arm the resulting syntax object > automatically. This drawback is similar to requiring module > implementors to use `protect-out' on identifiers that should > not be accessible to untrusted modules (so it's not a new > problem, in some sense). > > - Taints protect all variables, including the private local > bindings in the expansion of `class'. > > Overall, Ryan and I expect that ease of reasoning will be the main > advantage of traits over certificates. > > > Taints API Overview > ------------------- > > "Tainted" identifiers cannot be compiled or expanded in either binding > or use positions. A syntax object is not usually tainted directly > directly. Instead, a "dye pack" is added to a syntax object with > `syntax-arm', which produces an "armed" syntax object; if `stx' is > armed, then `(syntax-e stx)' or `(datum->syntax stx ....)' produces a > tainted syntax object. A syntax object can be "disarmed" using a > suitable code inspector. > > > (syntax-arm stx [insp use-mode?]) -> syntax? > stx : syntax? > insp : (or/c inspector? #f) > use-mode? : any/c = #f > > Arms `new-stx' with a dye pack that is controlled by `insp'. If > `insp' is #f, then during the dynamic extent of a transformer, it is > replaced with the instantiation-time inspector of the module where a > macro transformer is defined; if no transformation is in progress, > `(current-code-inspector)' is used. > > If `use-mode?' is a true value, then a 'taint-mode property on `stx' > can push dye packs to nested syntax objects, and certain forms like > `begin' and `define-values' also push the dye packs down. This is > the same as the way that 'certify-mode and the macro expander worked > before, and 'certify-mode is supported as an alias for 'taint-mode > for backward compatibility. If a 'taint-mode specifies 'transparent > for a syntax object, then the syntax object's lexical information is > removed as the dye pack is pushed to its children, so no lexical > information is left unprotected. > > > (syntax-disarm stx [insp]) > stx : syntax? > insp : (or/c inspector? #f) = #f > > Removes immediate dye packs (not taints) from `stx' (not from nested > syntax objects in `stx') for which `insp' is powerful enough. > > > (syntax-rearm stx from-stx use-mode?) > stx : syntax? > from-stx : syntax? > use-mode? : any/c = #f > > Copies immediate dye packs and taints from `from-stx' to `stx'. If > `use-mode?' is true, the copies are pushed to internal syntax > object, as in `syntax-arm'. > > > (syntax-protect stx) -> syntax? > stx : syntax? > > An alias for `(syntax-arm stx #f #t)'. A macro typically uses this > function to protect its result; it's built into `syntax-rules', > `syntax-id-rules', and `define-syntax-rule'. > > > (syntax-tainted? stx) -> boolean? > stx : syntax? > > Returns #t if `stx' is tainted, #f otherwise. > > > (syntax-taint stx) -> syntax? > stx : syntax? > > Equivalent to `(datum->syntax (syntax-arm stx) (syntax-e stx))', or > just `stx' if it is already tainted. > > > The macro expander disarms a syntax object (using an all-powerful > inspector) before it applies a macro transformer to the syntax object. > The macro expander rearms the result of a transformer using the > original syntax object (i.e., any dye packs that were stripped from > the transfomer input are copied to the transformer output). The > 'taint-mode (or 'certify-mode, for compatibility) property is used for > rearming. > > Only expressions and `#%module-begin' forms should be armed. In > particular, the macro expander disarms only expressions and module > bodies before pulling them apart, and so other tools only need to > worry about taint armings at those levels. > > See the Guide section 16.2.5 for more information. > > _________________________________________________ > For list-related administrative tasks: > http://lists.racket-lang.org/listinfo/dev > _________________________________________________ For list-related administrative tasks: http://lists.racket-lang.org/listinfo/dev

