I've pushed a change to Racket's macro system that throws out the syntax-certificate system and adds a syntax-taint system.
Syntax taints, like syntax certificates before, are intended to protect macro expansions from abuse. "Abuse" means using `local-expand' to extract a piece of an expansion, then putting the piece in a new context or using `datum->syntax' to access unexported module bindings using the piece's lexical context. Meanwhile, program-processing tools like `errortrace' or languages like `typed/racket' are supposed to pull apart expansions and reorganize them; code inspectors remain the way to distinguish trusted tools and languages from potential abusers. Things you need to know: * When writing a macro (without `syntax-rules', `define-syntax-rule', or `syntax-id-rules'), apply `syntax-protect' to the macro's result syntax object to protect it from abuse. The `syntax-rules', etc., forms use `syntax-protect' automatically. * If you write program-processing tools or languages that use `syntax-recertify', you'll need to change them to use `syntax-disarm' and possibly `syntax-rearm'. More Background --------------- Combining extensibility with security is a difficult problem in most any domain. Macros are an important tool for language extensibility, and `local-expand' enables an important class of language extensions in Racket, but `local-expand' also opens the door to macro bindings that break abstractions --- reaching into a macro expansion to access bindings that are intended to be private. Syntax certificates were designed to protect private bindings in the presence of `local-expand'. The idea was that a macro expansion can be certified as originating from a particular module, and the use of unexported identifiers would be allowed based on that certification. Pulling apart a syntax object loses certificates, so unexported bindings cannot be extracted and used in contexts which the defining module would not approve. Certificates didn't work well enough: * The distinction between unexported bindings and other bindings meant that certificate-maintenance problems were often detected late, since detecting problems required an example that spans module boundaries and involves unexported identifier. * Since certificates are module-specific, identifiers introduced in a macro expansion must be distinguished from those present in a macro use; certificates thus get tangled up in the macro system's hygiene machinery. Macro-generating macros create particular complexities when the generator macro lives in one module and the generated macro lives in another module. The system of "lifting" certificates and "inactive" versus "active" certificates, which evolved to address this problem, was difficult to reason about. In the case of recent reported problems with certificates, it's still not clear whether the certificate system was buggy or misused. * We found two holes in the certificate system: - Grabbing a macro transformer by `syntax-local-value' and applying it directly could provide access to identifiers with "inactive" certificates. The inactive certificates could then be lifted to any use context of the macro, defeating the intended protection of certificates. - Using `local-expand' on method names within a class could expose locally bound variables that were intended as private (i.e., they had no protection because they were not module-bound variables). Ryan and I have been looking for a better solution for a long time. I think Ryan has solved the problem with the design of the taint system. Proving that syntax taints provide security as intended remains a steep challenge, but the design closes the known hole with certificates, and our intuition is that they should work better in practice: * Taints apply uniformly to all identifiers, including local identifiers, identifiers in the same module, exported identifiers, and unexported identifiers. Taint-management problems in syntax tools should therefore become apparent more readily. * Taints are not module-specific, which detangles them from hygiene machinery and avoids many other design and implementation problems. One indicator of this improvement is that the total size of ".zo" files in the Racket installation is 20% smaller with taints in place of certificates. (DrRacket's footprint is about 5% smaller, and start-up time for `racket' is slightly shorter. Interestingly, `raco setup' time is not significantly affected --- perhaps because certificate tracking had been such a bottleneck in the past that I spent a lot of time making it fast.) The "syntax.c" file is 1000 lines shorter. The trigger for a taint (a "dye pack" --- see below) is inspector-specific, so macro-generating macros still require some care in case the generator and the generated macro are under different code inspectors; this extra work is still easier to reason about than inactive certificates. * The two known holes are fixed: - Using `syntax-local-value' to access a transformer directly no longer circumvents the system, because the transformer itself arms a syntax object, rather than making protection the responsibility of the macro expander that invokes the transformer. To some degree, this change also allows a macro implementor to opt out of the taint system. A drawback of the change is that a macro implementor must explicitly opt in to protection of its enclosing module --- at least when not using `syntax-rules', `define-syntax-rule', or `syntax-id-rules', all of which arm the resulting syntax object automatically. This drawback is similar to requiring module implementors to use `protect-out' on identifiers that should not be accessible to untrusted modules (so it's not a new problem, in some sense). - Taints protect all variables, including the private local bindings in the expansion of `class'. Overall, Ryan and I expect that ease of reasoning will be the main advantage of traits over certificates. Taints API Overview ------------------- "Tainted" identifiers cannot be compiled or expanded in either binding or use positions. A syntax object is not usually tainted directly directly. Instead, a "dye pack" is added to a syntax object with `syntax-arm', which produces an "armed" syntax object; if `stx' is armed, then `(syntax-e stx)' or `(datum->syntax stx ....)' produces a tainted syntax object. A syntax object can be "disarmed" using a suitable code inspector. (syntax-arm stx [insp use-mode?]) -> syntax? stx : syntax? insp : (or/c inspector? #f) use-mode? : any/c = #f Arms `new-stx' with a dye pack that is controlled by `insp'. If `insp' is #f, then during the dynamic extent of a transformer, it is replaced with the instantiation-time inspector of the module where a macro transformer is defined; if no transformation is in progress, `(current-code-inspector)' is used. If `use-mode?' is a true value, then a 'taint-mode property on `stx' can push dye packs to nested syntax objects, and certain forms like `begin' and `define-values' also push the dye packs down. This is the same as the way that 'certify-mode and the macro expander worked before, and 'certify-mode is supported as an alias for 'taint-mode for backward compatibility. If a 'taint-mode specifies 'transparent for a syntax object, then the syntax object's lexical information is removed as the dye pack is pushed to its children, so no lexical information is left unprotected. (syntax-disarm stx [insp]) stx : syntax? insp : (or/c inspector? #f) = #f Removes immediate dye packs (not taints) from `stx' (not from nested syntax objects in `stx') for which `insp' is powerful enough. (syntax-rearm stx from-stx use-mode?) stx : syntax? from-stx : syntax? use-mode? : any/c = #f Copies immediate dye packs and taints from `from-stx' to `stx'. If `use-mode?' is true, the copies are pushed to internal syntax object, as in `syntax-arm'. (syntax-protect stx) -> syntax? stx : syntax? An alias for `(syntax-arm stx #f #t)'. A macro typically uses this function to protect its result; it's built into `syntax-rules', `syntax-id-rules', and `define-syntax-rule'. (syntax-tainted? stx) -> boolean? stx : syntax? Returns #t if `stx' is tainted, #f otherwise. (syntax-taint stx) -> syntax? stx : syntax? Equivalent to `(datum->syntax (syntax-arm stx) (syntax-e stx))', or just `stx' if it is already tainted. The macro expander disarms a syntax object (using an all-powerful inspector) before it applies a macro transformer to the syntax object. The macro expander rearms the result of a transformer using the original syntax object (i.e., any dye packs that were stripped from the transfomer input are copied to the transformer output). The 'taint-mode (or 'certify-mode, for compatibility) property is used for rearming. Only expressions and `#%module-begin' forms should be armed. In particular, the macro expander disarms only expressions and module bodies before pulling them apart, and so other tools only need to worry about taint armings at those levels. See the Guide section 16.2.5 for more information. _________________________________________________ For list-related administrative tasks: http://lists.racket-lang.org/listinfo/dev