Hi Aman,
Even though the specification says "not in any particular order," `getInterfaces` and `getMethods` actually return ordered arrays, following the order in which these interfaces/methods are declared in their class files.
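To see what order a given JDK reports, a small probe like the following can help. It is only a sketch: the `Command` interface and the `MethodOrderProbe` class are hypothetical stand-ins (not picocli's types), and since the specification promises no order, it simply prints whatever `getMethods` returns on the JDK it runs on, plus the name of the generated proxy class.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class MethodOrderProbe {
    // Hypothetical interface, used only for illustration.
    public interface Command {
        boolean helpCommand();
        String name();
        String[] aliases();
    }

    // Names of the interface's public methods, in the order getMethods() returns them.
    public static List<String> observedOrder(Class<?> iface) {
        List<String> names = new ArrayList<>();
        for (Method m : iface.getMethods()) {
            names.add(m.getName());
        }
        return names;
    }

    // Creates a dynamic proxy for Command and returns the generated class's name.
    public static String proxyClassName() {
        // The handler is never invoked here; it only satisfies the API.
        InvocationHandler handler = (proxy, method, args) -> null;
        Object p = Proxy.newProxyInstance(
                Command.class.getClassLoader(),
                new Class<?>[] { Command.class },
                handler);
        return p.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("getMethods order: " + observedOrder(Command.class));
        System.out.println("proxy class:      " + proxyClassName());
    }
}
```

Running it a few times on JDK 8 and on a recent JDK should show whether the order is stable on a particular setup; the output is deliberately not predicted here.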
I believe you are decompiling proxy classes generated by an older version of the JDK; for example, back in JDK 8, the proxy methods were not ordered because they were tracked in a HashMap:
https://github.com/openjdk/jdk8u/blob/6b53212ef78ad50f9eede829c5ff87cadcdb434b/jdk/src/share/classes/sun/misc/ProxyGenerator.java#L405
This is no longer the case:
https://github.com/openjdk/jdk/blob/d59c12fe1041a1f61f68408241a9aa4d96ac4fd2/src/java.base/share/classes/java/lang/reflect/ProxyGenerator.java#L241

- Chen

On Wed, May 22, 2024 at 1:19 PM Aman Sharma <aman...@kth.se> wrote:

> Hi,
>
> Another thing I wanted to look into in this thread was the order of fields
> in the generated Proxy classes. Their names are also based on a number. The
> same proxy class can, across different executions, have a random order of
> `Method` fields, and the methods can be mapped to different field names.
>
> For example, consider the proxy class based on `picocli.CommandLine
> <https://github.com/remkop/picocli/blob/da98db63d1b516141b7485881b0dcddfd082dbc8/src/main/java/picocli/CommandLine.java#L4541>`
> in two different executions.
>
> // fields and methods are truncated for brevity
> public final class $Proxy9 extends Proxy implements CommandLine.Command {
>     private static Method m1;
>     private static Method m32;
>     private static Method m21;
>     private static Method m43;
>     private static Method m36;
>     private static Method m27;
>
>     public final boolean helpCommand() {
>         try {
>             return (Boolean)super.h.invoke(this, m32, (Object[])null);
>         } catch (RuntimeException | Error var2) {
>             throw var2;
>         } catch (Throwable var3) {
>             throw new UndeclaredThrowableException(var3);
>         }
>     }
>
> // fields and methods are truncated for brevity
> public final class $Proxy13 extends Proxy implements CommandLine.Command {
>     private static Method m1;
>     private static Method m29;
>     private static Method m16;
>     private static Method m40;
>     private static Method m38;
>     private static Method m12;
>
>     public final boolean helpCommand() {
>         try {
>             return (Boolean)super.h.invoke(this, m29, (Object[])null);
>         } catch (RuntimeException | Error var2) {
>             throw var2;
>         } catch (Throwable var3) {
>             throw new UndeclaredThrowableException(var3);
>         }
>     }
>
> Notice the difference in the order of fields, and that the `helpCommand`
> method is mapped to a different field name in each class. This happens because
> the method array returned by `getMethods` is not sorted in any particular
> order
> <https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/Class.java#L2178>
> when generating a proxy class. What dictates this order? And why is it
> not deterministic?
> > > Regards, > Aman Sharma > > PhD Student > KTH Royal Institute of Technology > School of Electrical Engineering and Computer Science (EECS) > Department of Theoretical Computer Science (TCS) > <http://www.kth.se> <https://www.kth.se/profile/amansha> > <https://www.kth.se/profile/amansha> > <https://www.kth.se/profile/amansha>https://algomaster99.github.io/ > ------------------------------ > *From:* Aman Sharma > *Sent:* Wednesday, May 22, 2024 4:12:19 PM > *To:* Chen Liang > *Cc:* David Holmes; core-libs-dev@openjdk.org; leyden-...@openjdk.org > *Subject:* Re: Deterministic naming of subclasses of > `java/lang/reflect/Proxy` > > > Hi Chen, > > > That's clear. Thanks for letting me know. I guess then Project Leyden is > working on naming the hidden classes deterministically to achieve their > goals <https://openjdk.org/projects/leyden/notes/01-beginnings>. > > > Regards, > Aman Sharma > > PhD Student > KTH Royal Institute of Technology > School of Electrical Engineering and Computer Science (EECS) > Department of Theoretical Computer Science (TCS) > <http://www.kth.se> <https://www.kth.se/profile/amansha> > <https://www.kth.se/profile/amansha> > <https://www.kth.se/profile/amansha>https://algomaster99.github.io/ > ------------------------------ > *From:* Chen Liang <liangchenb...@gmail.com> > *Sent:* Wednesday, May 22, 2024 1:35:46 PM > *To:* Aman Sharma > *Cc:* David Holmes; core-libs-dev@openjdk.org; leyden-...@openjdk.org > *Subject:* Re: Deterministic naming of subclasses of > `java/lang/reflect/Proxy` > > Hi Aman, > We have tried defining Proxy as hidden classes; a previous attempt was on > hold because of issues with serialization. Otherwise, Proxies work great as > hidden classes. > > Chen > > On Mon, May 20, 2024 at 7:56 AM Aman Sharma <aman...@kth.se> wrote: > >> Hi David, >> >> >> > I would not expect any class load >> events. >> >> >> I understand. 
I also haven't tried to intercept them but I see only one >> approach right now to include them in an allowlist - 1) statically look for >> invocations of "Lookup::defineHiddenClass". 2) Instrument them so that >> its first argument "bytes" can be looked into upon. I haven't looked into >> it much because I did not have much idea about it. And they are hidden so >> it made it worse. 😅 Thanks for sharing the JEP! >> >> >> > >> java.lang.reflect.Proxy could define hidden classes to act as the proxy >> classes which implement proxy interfaces; from JEP 317 >> >> >> It says that Proxy classes will also become hidden classes. Is it >> underway? Right now one can intercept, transform them, and include them in >> an allowlist. What do you think of naming them independent of AtomicLong so >> that a proxy class generated at runtime is easy to lookup in the allowlist? >> >> >> >> Regards, >> Aman Sharma >> >> PhD Student >> KTH Royal Institute of Technology >> School of Electrical Engineering and Computer Science (EECS) >> Department of Theoretical Computer Science (TCS) >> <http://www.kth.se> <https://www.kth.se/profile/amansha> >> <https://www.kth.se/profile/amansha> >> <https://www.kth.se/profile/amansha>https://algomaster99.github.io/ >> ------------------------------ >> *From:* David Holmes <david.hol...@oracle.com> >> *Sent:* Monday, May 20, 2024 2:30:37 PM >> *To:* Aman Sharma; liangchenb...@gmail.com >> *Cc:* core-libs-dev@openjdk.org; leyden-...@openjdk.org >> *Subject:* Re: Deterministic naming of subclasses of >> `java/lang/reflect/Proxy` >> >> On 20/05/2024 10:12 pm, Aman Sharma wrote: >> > Hi David, >> > >> > >> > > How did you try to intercept them? Hidden classes are not "loaded" in >> > the normal sense so won't trigger class load events. >> > >> > >> > I could not intercept them. I only see them when I pass >> `-verbose:class` >> > in the Java CLI. >> >> Yes that is why I asked how you tried to intercept them. 
>> >> > >> > I also couldn't intercept them using JVMTI Class File Load Hook >> > < >> https://docs.oracle.com/en/java/javase/21/docs/specs/jvmti.html#ClassFileLoadHook> >> event. However JEP 371 suggests that it should be possible to intercept >> them using JVMTI Class Load < >> https://docs.oracle.com/en/java/javase/21/docs/specs/jvmti.html#ClassLoad> >> event, but I won't have the bytecode at this stage. So is there no way to >> get its bytecode before it is linked and initialized in the JVM? >> >> Hidden classes are not loaded so I would not expect any class load >> events. However the exact nature of the JVMTI class load event is >> unclear as it talks about "class or interface creation" which is neither >> loading or defining per se. But a class prepare event sounds like it >> should be issued. However neither give you access to the bytecode of the >> class AFAICS. >> >> David >> ----- >> >> >> > >> > Regards, >> > Aman Sharma >> > >> > PhD Student >> > KTH Royal Institute of Technology >> > School of Electrical Engineering and Computer Science (EECS) >> > Department of Theoretical Computer Science (TCS) >> > < >> http://www.kth.se><https://www.kth.se/profile/amansha><https://www.kth.se/profile/amansha >> > >> > <https://www.kth.se/profile/amansha>https://algomaster99.github.io/ >> > <https://algomaster99.github.io/> >> > ------------------------------------------------------------------------ >> > *From:* David Holmes <david.hol...@oracle.com> >> > *Sent:* Monday, May 20, 2024 2:59:17 AM >> > *To:* Aman Sharma; liangchenb...@gmail.com >> > *Cc:* core-libs-dev@openjdk.org; leyden-...@openjdk.org >> > *Subject:* Re: Deterministic naming of subclasses of >> > `java/lang/reflect/Proxy` >> > On 17/05/2024 9:43 pm, Aman Sharma wrote: >> >> Hi Chen, >> >> >> >> > java.lang.invoke.LambdaForm$MH/0x00000200cc000400 >> >> >> >> I do see this as output when I pass -verbose:class. 
However, based on >> my >> >> experiments, I have seen that neither an agent passed via 'javaagent' >> >> nor an agent passed via 'agentpath' is able to intercept this hidden >> class. >> > >> > How did you try to intercept them? Hidden classes are not "loaded" in >> > the normal sense so won't trigger class load events. >> > >> >> Also, I was a bit confused since I saw somewhere that the names of >> >> hidden classes are null. But thanks for clarifying here. >> > >> > The JEP clearly defines the name format for hidden classes - though the >> > final component is VM specific (and typically a hashcode). >> > >> > https://openjdk.org/jeps/371 <https://openjdk.org/jeps/371> >> > >> > Cheers, >> > David >> > ----- >> > >> >> > avoid dynamic class loading >> >> >> >> I don't see dynamic class loading as a problem. I only mind some >> >> unstable generation aspects of them which make it hard to verify them >> >> based on an allowlist. >> >> >> >> For example, if this hidden class is generated with the exact same >> name >> >> and the exact same bytecode during runtime as well, it would be easy >> to >> >> verify it. However, I do see the names are based on some sort of >> memory >> >> address so and I don't know what bytecode it has so I don't have >> >> suggestions to make them stable as of now. For Proxy classes, I feel >> it >> >> can be addressed unless you disagree or some involved in Project >> Leyden >> >> does. :) Thank you for forwarding my mail there. 
>> >> >> >> Regards, >> >> Aman Sharma >> >> >> >> PhD Student >> >> KTH Royal Institute of Technology >> >> https://algomaster99.github.io/ <https://algomaster99.github.io/> >> > <https://algomaster99.github.io/ <https://algomaster99.github.io/>> >> >> >> >> >> ------------------------------------------------------------------------ >> >> *From:* liangchenb...@gmail.com <liangchenb...@gmail.com> >> >> *Sent:* Friday, May 17, 2024 1:23:58 pm >> >> *To:* Aman Sharma <aman...@kth.se> >> >> *Cc:* core-libs-dev@openjdk.org <core-libs-dev@openjdk.org>; >> >> leyden-...@openjdk.org <leyden-...@openjdk.org> >> >> *Subject:* Re: Deterministic naming of subclasses of >> >> `java/lang/reflect/Proxy` >> >> >> >> Hi Aman, >> >> For `-verbose:class`, it's a JVM argument instead of a program >> argument; >> >> so when you run a java program like `java Main`, you should call it as >> >> `java -verbose:class Main`. >> >> When done correctly, you should see hidden class outputs like: >> >> [0.032s][info][class,load] >> >> java.lang.invoke.LambdaForm$MH/0x00000200cc000400 source: >> >> __JVM_LookupDefineClass__ >> >> The loading of java.lang.invoke hidden classes requires your program >> to >> >> use MethodHandle features, like a lambda. >> >> >> >> I think the problem you are exploring, that to avoid dynamic class >> >> loading and effectively turn Java Platform closed for security, is >> also >> >> being accomplished by project Leyden (as I've shared initially); Thus, >> I >> >> am forwarding this to leyden-dev instead, so you can see what approach >> >> Leyden uses to accomplish the same goal as yours. >> >> >> >> Regards, Chen Liang >> >> >> >> On Fri, May 17, 2024 at 4:40 AM Aman Sharma <aman...@kth.se >> >> <mailto:aman...@kth.se <mailto:aman...@kth.se <aman...@kth.se>>>> >> wrote: >> >> >> >> __ >> >> >> >> Hi Roger, >> >> >> >> >> >> Do you have ideas on how to intercept them? My javaagent is not >> able >> >> to nor a JVMTI agent passed using `agentpath` option. 
It also does >> >> not seem to show up in logs when I pass `-verbose:class`. >> >> >> >> >> >> Also, what do you think of renaming the proxy classes as suggested >> >> below? >> >> >> >> >> >> Regards, >> >> Aman Sharma >> >> >> >> PhD Student >> >> KTH Royal Institute of Technology >> >> School of Electrical Engineering and Computer Science (EECS) >> >> Department of Theoretical Computer Science (TCS) >> >> <http://www.kth.se><https://www.kth.se/profile/amansha>< >> https://www.kth.se/profile/amansha < >> http://www.kth.se><https://www.kth.se/profile/amansha><https://www.kth.se/profile/amansha >> >> >> >> <https://www.kth.se/profile/amansha >> > <https://www.kth.se/profile/amansha>>https://algomaster99.github.io/ >> >> <https://algomaster99.github.io/ <https://algomaster99.github.io/ >> >> >> >> >> ------------------------------------------------------------------------ >> >> *From:* core-libs-dev <core-libs-dev-r...@openjdk.org >> >> <mailto:core-libs-dev-r...@openjdk.org >> > <mailto:core-libs-dev-r...@openjdk.org <core-libs-dev-r...@openjdk.org>>>> >> on behalf of Roger Riggs >> >> <roger.ri...@oracle.com <mailto:roger.ri...@oracle.com < >> mailto:roger.ri...@oracle.com <roger.ri...@oracle.com>>>> >> >> *Sent:* Friday, May 17, 2024 4:57:46 AM >> >> *To:* core-libs-dev@openjdk.org <mailto:core-libs-dev@openjdk.org >> <mailto:core-libs-dev@openjdk.org <core-libs-dev@openjdk.org>>> >> >> *Subject:* Re: Deterministic naming of subclasses of >> >> `java/lang/reflect/Proxy` >> >> Hi Aman, >> >> >> >> You may also run into hidden classes (JEP 371: Hidden Classes) that >> >> allow classes to be defined, at runtime, without names. >> >> It has been proposed to use them for generated proxies but that >> >> hasn't been implemented yet. >> >> There are benefits to having nameless classes, because they can't >> be >> >> referenced by name, only as a capability, they can be better >> >> encapsulated. 
>> >>
>> >> fyi, Roger Riggs
>> >>
>> >>
>> >> On 5/16/24 8:11 AM, Aman Sharma wrote:
>> >>>
>> >>> Hi,
>> >>>
>> >>> Thanks for your response, Liang!
>> >>>
>> >>> > I think you meant CVE-2021-42392 instead of 2022.
>> >>>
>> >>> Sorry for the error. I indeed meant CVE-2021-42392
>> >>> <https://nvd.nist.gov/vuln/detail/cve-2021-42392>.
>> >>>
>> >>> > Leyden mainly avoids this unstable generation by performing a
>> >>> training run to collect classes loaded
>> >>>
>> >>> I would love to know the details of Project Leyden and how it has
>> >>> worked toward this goal so far. In our case, the training run
>> >>> is the test suite.
>> >>>
>> >>> > GeneratedConstructorAccessor is already retired by JEP 416 [2]
>> >>> in Java 18
>> >>>
>> >>> I did see them not appearing in my allowlist when I ran my study
>> >>> subject (Apache PDFBox) with Java 21. Thanks for letting me know
>> >>> about this JEP. I see they are re-implemented with method handles.
>> >>>
>> >>> > How are you checking the classes?
>> >>>
>> >>> To detect runtime generated code, we have a javaagent that is hooked
>> >>> statically to the test suite execution. It gives us all classes
>> >>> that are loaded after the JVM and the javaagent are loaded. So
>> >>> we only check the classes loaded for the purpose of running the
>> >>> application. This is also why we did not choose -agentlib as it
>> >>> would give classes for the setting up JVM and javaagent and we the
>> >>> user of our tool must the classes they load.
>> >>>
>> >>> Next, we have a `ClassFileTransformer` hook in the agent where we
>> >>> produce the checksum using the bytecode. And we compare the
>> >>> checksum with the one existing in the allowlist. The checksum
>> >>> computation algorithm is the same for both steps. Let me describe how
>> >>> I compute the checksum.
>> >>>
>> >>> 1.
I get the CONSTANT_Class_info
>> >>> <https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.1>
>> >>> entry corresponding to `this_class` and rewrite the CONSTANT_Utf8_info
>> >>> <https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.7>
>> >>> it refers to with a fixed String constant, say "foo".
>> >>> 2. Since the name of the class is used to refer to its
>> >>> members (fields/methods), I get all CONSTANT_Fieldref_info
>> >>> <https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.2>
>> >>> entries and, if an entry's `class_index` corresponds to the old `this_class`, rewrite
>> >>> the UTF8 value of class_index to the same constant "foo".
>> >>> 3. Next, since the names of the fields in Proxy classes are
>> >>> also suffixed by numbers, for example, `private static Method
>> >>> m4`, we rewrite the UTF8 value of the name in the
>> >>> CONSTANT_NameAndType_info
>> >>> <https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.6>.
>> >>> 4. These fields can also have a random order so we simply sort
>> >>> the entire byte code using `Arrays.sort(byte[])` to eliminate
>> >>> any differences due to ordering of fields/methods.
>> >>> 5. Simply sorting the byte array still left minute differences. I
>> >>> could not understand why they existed even though the values in
>> >>> the constant pool of the bytecode in the allowlist and at runtime were
>> >>> exactly the same after rewriting. The differences existed in
>> >>> the bytes of the Code attribute of methods. I concluded that
>> >>> the bytes stored some position information. 
To avoid this, I
>> >>> created a subarray where I considered the bytes corresponding
>> >>> to `CONSTANT_Utf8_info.bytes` only. Computing a checksum for
>> >>> it resulted in the same checksum for both class files.
>> >>>
>> >>> Let's understand the whole approach with an example of a Proxy class.
>> >>>
>> >>> `public final class $Proxy42 extends Proxy implements org.apache.logging.log4j.core.config.plugins.Plugin {`
>> >>>
>> >>> This will go in the allowlist as "Proxy_Plugin: <SHA256 checksum>".
>> >>>
>> >>> When the same class is intercepted at runtime, say as "$Proxy10", we
>> >>> look for "Proxy_Plugin" in the allowlist and, since the checksum
>> >>> algorithm is the same in both cases, we get a match and let the class
>> >>> load.
>> >>>
>> >>> This approach has seemed to work well for Proxy classes and Generated
>> >>> Constructor Accessors (which are removed, as you said). I also looked
>> >>> at the species generated by method handles. I did not notice any
>> >>> modification in them. Their name generation seemed okay to me. If
>> >>> some new Species are generated, they are of course detected since they
>> >>> are not in the allowlist.
>> >>>
>> >>> I have not looked into LambdaMetafactory because I have not
>> >>> encountered it as a problem so far, but I am aware its name
>> >>> generation is also unstable. I have run my approach on only a few
>> >>> projects. And for hidden classes, I assume the agent
>> >>> won't be able to intercept them so detecting them would be really
>> >>> hard.
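The idea behind steps 1 and 5 above can be sketched roughly as follows. This is not the actual tool's implementation, only a minimal illustration under stated assumptions: the class name `Utf8Checksum`, the method `checksum`, and the placeholder `foo` are made up for the example; the Utf8 entries are sorted as strings rather than byte by byte; and the field-name rewriting of step 3 is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Utf8Checksum {
    // Hypothetical fixed name that replaces the class's own (unstable) name.
    static final String PLACEHOLDER = "foo";

    // SHA-256 (hex) over the sorted CONSTANT_Utf8 strings of a class file.
    public static String checksum(byte[] classFile) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile));
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readUnsignedShort(); // minor_version
            in.readUnsignedShort(); // major_version
            int count = in.readUnsignedShort(); // constant_pool_count
            Map<Integer, String> utf8 = new HashMap<>();         // pool index -> string
            Map<Integer, Integer> classToName = new HashMap<>(); // Class entry -> name index
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1: utf8.put(i, in.readUTF()); break;                 // Utf8
                    case 7: classToName.put(i, in.readUnsignedShort()); break; // Class
                    case 8: case 16: case 19: case 20: in.skipBytes(2); break;
                    case 15: in.skipBytes(3); break;                          // MethodHandle
                    case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                        in.skipBytes(4); break;
                    case 5: case 6: in.skipBytes(8); i++; break; // Long/Double use two slots
                    default: throw new IllegalArgumentException("unknown constant pool tag " + tag);
                }
            }
            in.readUnsignedShort(); // access_flags
            Integer nameIndex = classToName.get(in.readUnsignedShort()); // this_class
            if (nameIndex == null) {
                throw new IllegalArgumentException("this_class is not a Class entry");
            }
            utf8.put(nameIndex, PLACEHOLDER); // step 1: neutralize the class's own name

            List<String> values = new ArrayList<>(utf8.values());
            Collections.sort(values); // make the result independent of pool order
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (String s : values) {
                md.update(s.getBytes(StandardCharsets.UTF_8));
                md.update((byte) 0); // separator between entries
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In a `ClassFileTransformer`, the `classfileBuffer` argument could be passed to `checksum` directly. Because only Utf8 entries are hashed, two classes that differ solely in their Code attributes would get the same checksum, which is a real limitation of this simplification.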
>> >>> >> >>> >> >>> Regards, >> >>> Aman Sharma >> >>> >> >>> PhD Student >> >>> KTH Royal Institute of Technology >> >>> School of Electrical Engineering and Computer Science (EECS) >> >>> Department of Theoretical Computer Science (TCS) >> >>> <https://www.kth.se/profile/amansha >> > <https://www.kth.se/profile/amansha>>https://algomaster99.github.io/ >> > <https://algomaster99.github.io/ <https://algomaster99.github.io/>> >> >>> >> ------------------------------------------------------------------------ >> >>> *From:* liangchenb...@gmail.com <mailto:liangchenb...@gmail.com < >> mailto:liangchenb...@gmail.com <liangchenb...@gmail.com>>> >> >>> <liangchenb...@gmail.com> <mailto:liangchenb...@gmail.com < >> mailto:liangchenb...@gmail.com <liangchenb...@gmail.com>>> >> >>> *Sent:* Thursday, May 16, 2024 5:52:03 AM >> >>> *To:* Aman Sharma; core-libs-dev >> >>> *Cc:* Martin Monperrus >> >>> *Subject:* Re: Deterministic naming of subclasses of >> >>> `java/lang/reflect/Proxy` >> >>> Hi Aman, >> >>> I think you meant CVE-2021-42392 instead of 2022. >> >>> >> >>> For your approach of an "allowlist" for Java runtime, project >> >>> Leyden is looking to generate a static image [1], that >> >>> > At run time it cannot load classes from outside the image, nor >> >>> can it create classes dynamically. >> >>> Leyden mainly avoids this unstable generation by performing a >> >>> training run to collect classes loaded and even object graphs; I >> >>> am not familiar with the details unfortunately. >> >>> >> >>> Otherwise, the Proxy discussion belongs better to core-libs-dev, >> >>> as java.lang.reflect.Proxy is part of Java's core libraries. I am >> >>> replying this thread to core-libs-dev. 
>> >>> >> >>> For your perceived problem that classes don't have unique names, >> >>> your description sounds dubious: GeneratedConstructorAccessor is >> >>> already retired by JEP 416 [2] in Java 18, and there are many >> >>> other cases in which JDK generates classes without stable names, >> >>> notoriously LambdaMetafactory (Gradle wished for cacheable >> >>> Lambdas); the same applies for the generated classes for >> >>> MethodHandle's LambdaForms (which carries implementation code for >> >>> LambdaForm). How are you checking the classes? It seems you are >> >>> not checking hidden classes. Proxy and Lambda classes are defined >> >>> by the caller's class loader, while LambdaForms are under JDK's >> >>> system class loader I think. We need to ensure you are correctly >> >>> finding all unstable classes before we can proceed. >> >>> >> >>> [1]: https://openjdk.org/projects/leyden/notes/01-beginnings >> > <https://openjdk.org/projects/leyden/notes/01-beginnings> >> >>> <https://openjdk.org/projects/leyden/notes/01-beginnings >> > <https://openjdk.org/projects/leyden/notes/01-beginnings>> >> >>> [2]: https://openjdk.org/jeps/416 <https://openjdk.org/jeps/416> >> > <https://openjdk.org/jeps/416 <https://openjdk.org/jeps/416>> >> >>> >> >>> On Wed, May 15, 2024 at 7:00 PM Aman Sharma <aman...@kth.se >> >>> <mailto:aman...@kth.se <mailto:aman...@kth.se <aman...@kth.se>>>> >> wrote: >> >>> >> >>> Hi, >> >>> >> >>> >> >>> My name is Aman and I am a PhD student at KTH Royal Institute >> >>> of Technology, Stockholm, Sweden. I research as part of CHAINS >> >>> <https://chains.proj.kth.se/ <https://chains.proj.kth.se/>> >> project to >> > strengthen the >> >>> software supply chain of multiple ecosystem. I particularly >> >>> focus on runtime integrity in Java. In this email, I want to >> >>> write about an issue I have discovered with /dynamic >> >>> generation of `java.lang.reflect.Proxy`classes/. 
I will
>> >>> propose a solution and would love to hear feedback from
>> >>> the community. Let me know if this is the correct mailing list
>> >>> for such discussions. It seemed the most relevant from this
>> >>> list <https://mail.openjdk.org/mailman/listinfo>.
>> >>>
>> >>> *My research*
>> >>>
>> >>> Java has features to load classes on the fly - it can either
>> >>> download or generate a class at runtime. These features are
>> >>> useful for the inner workings of the JDK, for example, implementing
>> >>> annotations, reflective access, etc. However, these features
>> >>> have also contributed to critical vulnerabilities in the past
>> >>> - CVE-2021-44228 (log4shell), CVE-2022-33980, CVE-2022-42392.
>> >>> All of these vulnerabilities have one thing in common - /a
>> >>> class that was not known during build time was
>> >>> downloaded/generated at runtime and loaded into JVM./
>> >>>
>> >>> To defend against such vulnerabilities, we propose a solution
>> >>> to /allowlist classes for runtime/. This allowlist will
>> >>> contain an exhaustive list of classes that can be loaded by
>> >>> the JVM and it will be enforced at runtime. We build this
>> >>> allowlist from three sources:
>> >>>
>> >>> 1. All classes of all modules provided by the Java Standard
>> >>> Library. We use ClassGraph
>> >>> <https://github.com/classgraph/classgraph> to scan the JDK.
>> >>> 2. The source code and all dependencies of an
>> >>> application. We use a software bill of materials to get
>> >>> all the data.
>> >>> 3. Finally, we run the test suite to include any runtime
>> >>> downloaded/generated classes.
>> >>>
>> >>> Such a list is able to prevent the above 3 CVEs because it
>> >>> does not let the "unknown" bytecode be loaded.
>> >>>
>> >>> *Problem with generating such an allowlist*
>> >>>
>> >>> The first two parts of the allowlist are easy to get. The
>> >>> problem is with the third step, where we want to allowlist all
>> >>> the classes that could be downloaded or generated. Upon
>> >>> running the test suite and hooking into the classes it loads, we
>> >>> observe that the list contains classes named
>> >>> "com/sun/proxy/$Proxy2" and
>> >>> "jdk/internal/reflect/GeneratedConstructorAccessor3", among
>> >>> many more. The purpose of these classes can be identified. The
>> >>> proxy class is created to implement an annotation. The
>> >>> accessor gives the JVM access to the constructor of a class.
>> >>>
>> >>> When enforcing this allowlist at runtime, we see that the
>> >>> bytecode content for "com/sun/proxy/$Proxy2" differs between the
>> >>> allowlist and runtime. In our case, we are experimenting
>> >>> with pdfbox <https://github.com/apache/pdfbox>, so we created
>> >>> the allowlist using its test suite. Then we enforced this
>> >>> allowlist while running some of its subcommands. However,
>> >>> there was some other proxy class, say "com/sun/proxy/$Proxy5",
>> >>> at runtime that implemented the same interfaces and had the
>> >>> same methods as "com/sun/proxy/$Proxy2" in the allowlist. They
>> >>> only differed in the name of the class, order of fields, and
>> >>> types for field references. This could happen because the
>> >>> order in which classes are loaded is workload dependent, but it
>> >>> makes it hard to generate such an allowlist.
>> >>>
>> >>> *Solution*
>> >>>
>> >>> We propose that naming of subclasses of
>> >>> "java/lang/reflect/Proxy" should not be dependent upon the
>> >>> order of loading. In order to do so, two issues can be fixed:
>> >>>
>> >>> 1.
The naming of the class should not be based on AtomicLong
>> >>> <https://github.com/openjdk/jdk/blob/b687aa550837830b38f0f0faa69c353b1e85219c/src/java.base/share/classes/java/lang/reflect/Proxy.java#L531>.
>> >>> Rather, it could be named based on the interfaces it implements. I also
>> >>> wonder why AtomicLong was chosen in the first place.
>> >>> 2. Methods of the interfaces must be in a particular order.
>> >>> Right now, they are not sorted in any particular order
>> >>> <https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/Class.java#L2178>.
>> >>>
>> >>> These fixes will make proxy class generation deterministic
>> >>> with respect to order of loading, so the generated classes won't be flagged at
>> >>> runtime since the test suite would already have detected them.
>> >>>
>> >>> I would love to hear from the community about these ideas. If
>> >>> in agreement, I would be happy to produce a patch. I have
>> >>> discovered this issue with subclasses of
>> >>> GeneratedConstructorAccessor
>> >>> <https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/reflect/ConstructorAccessor.java>
>> >>> as well and I imagine it will also apply to some other runtime generated
>> >>> classes. If you disagree, please let me know as well. It helps with my
>> >>> research.
>> >>> >> >>> I also have PoCs for the above CVEs >> >>> <https://github.com/chains-project/exploits-for-sbom.exe >> > <https://github.com/chains-project/exploits-for-sbom.exe>> and >> >>> a proof concept tool is being developed under the name >> >>> sbom.exe <https://github.com/chains-project/sbom.exe >> > <https://github.com/chains-project/sbom.exe>> in case >> >>> any one wonders about the implementation. I would also be >> >>> happy to explain more. >> >>> >> >>> Regards, >> >>> Aman Sharma >> >>> >> >>> PhD Student >> >>> KTH Royal Institute of Technology >> >>> School of Electrical Engineering and Computer Science (EECS) >> >>> Department of Theoretical Computer Science (TCS) >> >>> <https://www.kth.se/profile/amansha >> > <https://www.kth.se/profile/amansha>>https://algomaster99.github.io/ >> > <https://algomaster99.github.io/ <https://algomaster99.github.io/>> >> >>> >> >> >> >> >> >