My branch with immediately represented characters is available at: https://github.com/97jaz/racket/tree/immediate-chars
I'm interested to hear any opinions from the members of this list on the implementation. For my own part, I'm ambivalent on the matter of actually incorporating this work into the official tree. Benchmarks that are heavy on character manipulation -- and there are few enough of these -- benefit from this change, to varying degrees.

A few micro-benchmarks:

1. Constructing a list of 10,000,000 characters using integer->char on integers chosen randomly from [0, 256) (average of 5 runs, CPU time):

   immediate-chars:   4248.8
   Racket v5.3.4.10:  4297.2

2. Constructing a list of 10,000,000 characters using integer->char on integers chosen randomly from the entire field of valid Unicode code points (average of 5 runs, CPU time):

   immediate-chars:   4441.4
   Racket v5.3.4.10:  5953.0

3. The 'wc' shootout benchmark:

   immediate-chars:   3789
   Racket v5.3.4.10:  4155

Unsurprisingly, the difference between the two is more noticeable when a lot of characters outside of the first 256 are being used.

The downside of this change is that a (Scheme_Object *) is now one of three things, rather than one of two. This sometimes requires an additional bit test in the interpreter and in the JIT.

At any rate, I'm interested to know what people think.

-Jon

_________________________
  Racket Developers list:
  http://lists.racket-lang.org/dev
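
For anyone who wants a concrete picture of the "one of three things" point above, here is a minimal sketch in C. It is not the code from the branch: the tag layout, the value_t type, and every function name below are hypothetical, chosen only to show why a third immediate kind can cost one extra bit test.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag layout (an assumption, not taken from the branch):
 *   ...1  fixnum, integer stored in the upper bits
 *   ..10  immediate character, code point stored in the upper bits
 *   ..00  pointer to a heap object aligned to at least 4 bytes
 */
typedef uintptr_t value_t;

enum kind { KIND_FIXNUM, KIND_CHAR, KIND_POINTER };

/* Pack an integer into the word; no allocation is needed. */
static value_t make_fixnum(intptr_t n) {
    return ((uintptr_t)n << 1) | 1u;
}

/* Pack a Unicode code point into the word; no allocation is needed. */
static value_t make_char(uint32_t code_point) {
    assert(code_point <= 0x10FFFF);
    return ((uintptr_t)code_point << 2) | 2u;
}

static enum kind classify(value_t v) {
    if (v & 1u) return KIND_FIXNUM;  /* the only test needed with two kinds */
    if (v & 2u) return KIND_CHAR;    /* the additional test for the third kind */
    return KIND_POINTER;
}

/* Relies on arithmetic right shift of signed values, which is
 * implementation-defined in C but what common compilers provide. */
static intptr_t fixnum_value(value_t v) {
    return (intptr_t)v >> 1;
}

static uint32_t char_value(value_t v) {
    return (uint32_t)(v >> 2);
}
```

With a layout like this, code that only needs to separate fixnums from everything else still does a single test on the low bit. The second test is paid only on paths that must tell an immediate character apart from a heap pointer.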