This note is prompted by work in a parallel project, Amber, on the implementation of record types, but is properly a JVM question about JSR 292 functionality. Since we’ve got a quorum of experts here, and since we briefly raised the topic this morning on a Zoom chat, I’ll raise the question here of ClassValue performance. I’m BCC-ing amber-spec-experts so they know we are talking about this. (In fact the EGs overlap.)
JSR 292 introduced ClassValue as a hook for libraries (especially dynamic language implementations) to efficiently store library-specific metadata on JVM classes. A general use case envisioned was to store method handles (or tuples of them) on classes, where a lazy link step (tied to the semantics of ClassValue::get) would materialize the required MHs as needed. A specific use case was to be able to create extensible v-table-like structures, where a CV would embody a v-table position, and each CV::get binding would embody a filled slot at that v-table position, for a particular class. The assumption was that dynamic languages using CV would continue to use the JVM’s built-in class mechanism for part or all of their own types, and also that it would be helpful for a dynamic language to adjoin metadata to system classes like java.lang.String. Both tactics have been used in the field. In the future, template classes may provide an even richer substrate for the types of non-Java languages.

JSR 292 was envisioned for dynamic languages, but was built according to the inherent capabilities of the JVM, and so eventually (actually, in the next release!) it has been used for Java language implementations as well (indy for lambda). ClassValue has not yet been used to implement Java language features, but I believe the time may have come to do so.

The general use case I have in mind is an efficient translation strategy for generic algorithms, where the genericity is in the receiver type. The specific use case is the default toString method of records (and also the equals and hashCode methods). The logic of this method is generic over the receiver type. For each record type (unless that record type overrides its toString method in source code), the toString method is defined to iterate over the fields of the record type, and produce a printed representation that mentions both the names and values of the fields. The name of the record’s class is also mentioned.
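Read literally, that specification amounts to a reflective walk over the record's components on every call. Here is a minimal sketch of that reading, assuming Java 16 or later (where records and Class::getRecordComponents are final). The names NaiveRecordToString, describe, and Point are hypothetical, and the output format is only one plausible rendering of "class name plus field names and values":

```java
import java.lang.reflect.RecordComponent;

// Hypothetical illustration: re-reads the record metadata on every call.
final class NaiveRecordToString {
    // Sample record used only for demonstration.
    record Point(int x, int y) {}

    static String describe(Record self) {
        Class<?> type = self.getClass();
        RecordComponent[] components = type.getRecordComponents();
        StringBuilder sb = new StringBuilder(type.getSimpleName()).append('[');
        for (int i = 0; i < components.length; i++) {
            if (i > 0) sb.append(", ");
            Object value;
            try {
                // Reflective call to the accessor; fails if the record class
                // is not accessible from this class.
                value = components[i].getAccessor().invoke(self);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
            sb.append(components[i].getName()).append('=').append(value);
        }
        return sb.append(']').toString();
    }
}
```

Nothing here is cached: the component array, the accessor lookups, and the reflective invocations are all repeated per call.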
If you ask an intermediate Java coder for an implementation of this spec., you will get something resembling an interpreter which walks over the metadata of “this.getClass()” and collects the necessary strings into a string builder. If you then deliver this code to users, after about a microsecond you will get complaints about its performance.

We’re old hands who don’t fall for such traps, so we asked an experienced coder for better code. That code runs the interpreter-like logic once per distinct record type, collecting the distinct field accesses and folding up the string concatenations into a spongy mass of method handles, depositing the result in a cache. That’s better! (Programming with method handles is, alas, not an improvement over source code. Java hasn’t found its best self yet for doing partial evaluation algorithms, though there is good work out there, like Truffle.)

In order not to have bad performance numbers, we are also preconditioning the v-table slot for each record’s toString method, as follows:

0. If the record already has a source-code definition, do nothing special.
1. Otherwise, synthesize a synthetic override method to Object::toString which contains a single indy instruction. (There is also data movement via aload and return.)
2. Set up the indy to run the fancy partial MH-builder mentioned above, the first time, and use the cached MH the second time.
3. Profit.

In essence, toString works like a generic algorithm, where the generic type parameter is the receiver type. (If we had template methods we’d have another route to take but not today…)

This works great. But there’s a flaw, because it doesn’t use ClassValue. As far as I can tell, it would be better for the translation strategy to *not* generate synthetic methods, but instead to put steps 1. and 2. above into a plain old Java method called Record::toString. This method would call x=this.getClass() and then y=R_TOSTRING.get(x) and then y.invokeExact(this).
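To make the x / y sequence concrete, here is a rough sketch of what such a method could look like, assuming Java 16 or later. R_TOSTRING is the name used above; everything else (RecordToStringSketch, link, format, recordToString, Point) is hypothetical. The sketch caches only the reflective walk and the accessor handles per class; it does not attempt the folding of string concatenation into method handles that the real MH-builder performs.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.RecordComponent;

final class RecordToStringSketch {
    // Sample record used only for demonstration.
    record Point(int x, int y) {}

    private static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();

    // One ClassValue = one "v-table position"; each binding is a filled slot.
    static final ClassValue<MethodHandle> R_TOSTRING = new ClassValue<>() {
        @Override
        protected MethodHandle computeValue(Class<?> type) {
            return link(type); // lazy link step, run on first get() per class
        }
    };

    // Walks the record metadata once and returns a handle of type (Object)String.
    private static MethodHandle link(Class<?> type) {
        RecordComponent[] components = type.getRecordComponents();
        if (components == null) {
            throw new IllegalArgumentException(type + " is not a record class");
        }
        try {
            String[] names = new String[components.length];
            MethodHandle[] getters = new MethodHandle[components.length];
            for (int i = 0; i < components.length; i++) {
                names[i] = components[i].getName();
                getters[i] = LOOKUP.unreflect(components[i].getAccessor());
            }
            MethodHandle format = LOOKUP.findStatic(RecordToStringSketch.class, "format",
                    MethodType.methodType(String.class, String.class, String[].class,
                            MethodHandle[].class, Object.class));
            // Bind the per-class data, leaving only the receiver as a parameter.
            return MethodHandles.insertArguments(format, 0,
                    type.getSimpleName(), names, getters);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    private static String format(String className, String[] names,
                                 MethodHandle[] getters, Object receiver) throws Throwable {
        StringBuilder sb = new StringBuilder(className).append('[');
        for (int i = 0; i < names.length; i++) {
            if (i > 0) sb.append(", ");
            Object value = getters[i].invoke(receiver); // invoke() adapts and boxes
            sb.append(names[i]).append('=').append(value);
        }
        return sb.append(']').toString();
    }

    // Stand-in for a plain-Java Record::toString body.
    static String recordToString(Object self) {
        Class<?> x = self.getClass();
        MethodHandle y = R_TOSTRING.get(x);
        try {
            return (String) y.invokeExact(self);
        } catch (RuntimeException | Error e) {
            throw e;
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }
}
```

For example, RecordToStringSketch.recordToString(new RecordToStringSketch.Point(1, 2)) links a handle for Point on the first call and reuses it afterwards. The lookup here can only reach record classes accessible to RecordToStringSketch; a real Record::toString inside java.base would have its own access story.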
Non-use of CV is not the flaw, it’s the cause of the flaw. The flaw is apparent if you read the javadoc for Record::toString. It doesn’t say there’s a method there (because there isn’t) but it says weaselly stuff about “the default method provided does this and that”. In a purely dynamic OOL, the default method is just the method bound to Record::toString, and it’s active as long as nobody overrides it (or calls super.toString). People spend years learning to reason about overrides in OOLs like Java, and we should cater to that. We could in this case, but we don’t, because we are pulling a non-OOL trick under the covers, and we have to be honest about it in the Javadoc.

So there’s a concern with CV (though I don’t think an overriding one) that we don’t get to step 3 and profit, because the lookups of x and y appear to be interpreter-like overheads. Won’t record types suffer in performance by having those extra indirections happen every time toString (or equals or hashCode) is called?

(This problem isn’t unique to Records, but Records are an early case of this sort of problem, of the need for link-time optimization of inheritable OO methods. If you look around you might find similar opportunities with interfaces and default methods.)

This is where CV has to get up out of its chair and make itself useful. I think the JVM should take three steps, two sooner and one later, all without changing any public API points.

1. Encourage the JIT to constant-fold through ClassValue::get. This would fold up the proposed Record::toString method at all points where the type of the receiver record is known to the JIT. (That’s most places.)
2. Ensure that, if the operand to CV::get is not constant, we get good code anyway. (This is already true, probably.) Look for any small optimization cleanups getting through CV::get and on into MH::invokeExact.
3. Later on, consider v-table slot splitting in response to polymorphic methods which perform CV::get on their receiver.
In general, v-table slot splitting is the practice of installing differently compiled code in different v-table slots of the same method. It can make sense if the JIT can do different jobs optimizing the same code on different classes of receivers. It’s usually a heroic hand optimization, but can also be done by the JVM.

One more item, not directly related to CVs but related to the above optimizations:

4. We should invest in one or more auto-bridging features in the JVM, where a call site (such as MyRecord::toString) can be rerouted through an intermediate step before it gets to the built-in target mandated by the JVMS (such as Object::toString or Record::toString), and can also be routed somewhere even if the supposed target method doesn’t even exist. Perhaps the target method symbolic reference is Foo::equals(int) and the statically matching method is Foo::equals(Object); normally the static compiler puts in an auto-boxing step to fix the descriptor but there are reasons to consider a more dynamic bridging solution. Such a rerouting decision would be very naturally cached in v-table slots, obviating some or all of step 3 above.

In the presence of feature #4, we might rewrite Record::toString to (somehow) advertise that it had no regular method body, but that it would be very happy to bridge any and all calls, using some advertised BSM, and decoupling the implementation from ClassValue. This implementation decision could be hidden from the user (and the Javadoc), but only if we did the ClassValue trick today, so we could advertise Record::toString as a regular old object-oriented method (with clever optimizations inside its implementation, natch).

So, let’s take ClassValue off the bench, and start warming up Bridge-O-Matic.

-- John