Re: RFR: 8294982: Implementation of Classfile API [v20]

Maurizio Cimadamore Thu, 16 Feb 2023 04:59:26 -0800

On Thu, 16 Feb 2023 10:41:33 GMT, Adam Sotona <asot...@openjdk.org> wrote:


>> This is root pull request with Classfile API implementation, tests and 
>> benchmarks initial drop into JDK.
>> 
>> Following pull requests consolidating JDK class files parsing, generating, 
>> and transforming 
>> ([JDK-8294957](https://bugs.openjdk.org/browse/JDK-8294957)) will chain to 
>> this one.
>> 
>> Classfile API development is tracked at:
>> https://github.com/openjdk/jdk-sandbox/tree/classfile-api-branch
>> 
>> Development branch of consolidated JDK class files parsing, generating, and 
>> transforming is at:
>> https://github.com/openjdk/jdk-sandbox/tree/classfile-api-dev-branch
>> 
>> Classfile API [JEP](https://bugs.openjdk.org/browse/JDK-8280389) and [online 
>> API 
>> documentation](https://htmlpreview.github.io/?https://raw.githubusercontent.com/openjdk/jdk-sandbox/classfile-api-javadoc-branch/doc/classfile-api/javadoc/java.base/jdk/internal/classfile/package-summary.html)
>>  is also available.
>> 
>> Please take you time to review this non-trivial JDK addition.
>> 
>> Thank you,
>> Adam
>
> Adam Sotona has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   added 4-byte Unicode text to Utf8EntryTest

src/java.base/share/classes/jdk/internal/classfile/impl/BoundAttribute.java 
line 49:

> 47:  * BoundAttribute
> 48:  */
> 49: public abstract sealed class BoundAttribute<T extends Attribute<T>>

Understanding checkpoint: there are two variant of most of the elements: an 
*unbound* form and a *bound* form. The former is what you get by constructing 
the element from scratch using a factory. The latter is what you get when the 
element is read from the classfile.

Since bound elements have a reference to real "bytes" in some classfile, this 
distinction allows the implementation to speed up the writing of the element 
contents to the class buffer - that is, writing an unbound element goes through 
a "slow path" which involves constant pool entry creation, etc. For bound 
elements, we can, at least in some cases (e.g. when the builder supports direct 
writing) just copy the underlying bytes to the target buffer.

src/java.base/share/classes/jdk/internal/classfile/impl/EntryMap.java line 30:

> 28:  * An open-chain multimap used to map nonzero hashes to indexes (of 
> either CP
> 29:  * elements or BSM entries).  Code transformed from public domain 
> implementation
> 30:  * 
> (http://java-performance.info/implementing-world-fastest-java-int-to-int-hash-map/).

Could not open this link - seems to redirect to main page

src/java.base/share/classes/jdk/internal/classfile/impl/EntryMap.java line 192:

> 190:         return (int)s;
> 191:     }
> 192: }

Watch for newlines

src/java.base/share/classes/jdk/internal/classfile/impl/SplitConstantPool.java 
line 83:

> 81:  * ConstantPool.
> 82:  */
> 83: public final class SplitConstantPool implements ConstantPoolBuilder {

Not sure the "Split" prefix carries any meaning? (perhaps a leftover from 
previous iterations?)

src/java.base/share/classes/jdk/internal/classfile/impl/SplitConstantPool.java 
line 217:

> 215:                 }
> 216:             };
> 217:             // Doing a full scan here yields fall-off-the-cliff 
> performance results,

Understanding checkpoint: the parent is a class reader, which holds some class 
bytes. The class reader is set up in a way that constant pool entries are read 
"on-demand" - that is, if I ask the class reader "give me CP entry 42", the 
classreader will only build that entry (assuming 42 is a valid index) and give 
me back that entry. Also, the class reader will save the read constant in an 
internal array, to avoid re-reading the classfile bytes multiple times.

So, what this code here does is: when we create an entry map, we populate the 
map with the entries from the parent reader - but we only add entries that were 
already looked up (e.g. we do not bother reading _all_ entries, and adding all 
of them to the entry map).

But, when `findEntry` is called, if we see that we can't find the entry, and we 
know that there might be constant pool entries from the parent reader still 
unread, we force a full scan of the parent reader pool and try again.

Correct?

src/java.base/share/classes/jdk/internal/classfile/impl/SplitConstantPool.java 
line 271:

> 269: 
> 270:     private <E extends PoolEntry> E internalAdd(E cpi, int hash) {
> 271:         int newIndex = size-parentSize;

I'm not sure I get this. The new entry is constructed with an index set to the 
pool size. That said, when we build the entry we don't yet know if the entry is 
already in the pool. The EntryMap data structure seems to have logic to detect 
duplicates (e.g. adding an already added entry is a no-op) - but this method 
seems to (a) add the CP entry to the `newEntries` array anyway and (b) happily 
return the entry with the index set to the pool size (which might, or might 
not, be the correct final index). What am I missing?

src/java.base/share/classes/jdk/internal/classfile/impl/TargetInfoImpl.java 
line 34:

> 32: 
> 33: /**
> 34:  *

Empty javadoc?

src/java.base/share/classes/jdk/internal/classfile/impl/TemporaryConstantPool.java
 line 60:

> 58:     private static final Options options = new 
> Options(Collections.emptyList());
> 59: 
> 60:     private TemporaryConstantPool() {};

If I understand correctly, this constant pool is just "fake" in the sense that 
it's a builder which creates constant pool entries w/o storing them anywhere. 
This seems to be used in places where e.g. we need to convert a ClassDesc to a 
ClassEntry, probably so that the implementation of e.g. attributes can be 
defined in terms of constant pool entries, even when "unbounded". E.g. this is 
a trick which avoids having completely different representations for bound and 
unbound elements - correct?

-------------

PR: https://git.openjdk.org/jdk/pull/10982

Re: RFR: 8294982: Implementation of Classfile API [v20]

Reply via email to