Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-30 Thread Aman Sharma
Hi Joe,


> As a general comment, it is _not_ the goal of the API specification to
> (over) specify exact behavior in cases like this.
>
> See as an example the discussion concerning behavioral compatibility
> starting around slide 46 of
>
> "Contributing to OpenJDK: Participating in stewardship for the long-term,"
> https://jcp.org/aboutJava/communityprocess/ec-public/materials/2023-06-13/Contributing_to_OpenJDK_2023_04_12.pdf
>
> This approach has evolved over the years and releases.
>
> In this case semantically, the array returned by getMethods is a set and
> no particular meaning should be read into the order of the elements.
>
> HTH,
>
> -Joe

I missed this email of yours. Thanks for making it clear.


Regards,
Aman Sharma

PhD Student
KTH Royal Institute of Technology
School of Electrical Engineering and Computer Science (EECS)
Department of Theoretical Computer Science (TCS)
http://www.kth.se
https://www.kth.se/profile/amansha
https://algomaster99.github.io/


Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-25 Thread Chen Liang
Did you reorder the fields in picocli.CommandLine$Command?
In $Proxy13, usageHelpWidth is preceded by versionProvider; in $Proxy9,
it is preceded by requiredOptionMarker. It would also help if you could
provide the bytecode for the Command class, as the difference might come
from how reflection orders methods under multiple inheritance (if Command
extends multiple interfaces): some subsequences of the methods (such as
requiredOptionMarker through footerHeading) do appear in order.

Chen

On Sat, May 25, 2024 at 4:03 PM Aman Sharma  wrote:

> Hi Chen,
>
> I am attaching proxy classes generated by the JVM of OpenJDK 22. Instead
> of decompiling them, I disassembled them, and I do see a difference. For
> example, see method `footerHeading` in both classes. In $Proxy9, it is
> mapped to the `m39` field and in $Proxy13, it is mapped to the `m21` field.
> What is the reason for this ordering? Why does the mapping of methods to
> fields depend upon the execution?
>
>
> Regards,
> Aman Sharma
>

Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-22 Thread Joseph D. Darcy


On 5/22/2024 11:19 AM, Aman Sharma wrote:

> Hi,
>
> [snip]
>
> Notice the difference in the order of fields and `helpCommand` method
> is mapped to a different field name in both classes. This happens
> because the method array returned by `getMethods` is not sorted in any
> particular order when generating a proxy class. What dictates this
> order? And why is it not deterministic?
>
> Regards,
> Aman Sharma

As a general comment, it is _not_ the goal of the API specification to
(over) specify exact behavior in cases like this.

See as an example the discussion concerning behavioral compatibility
starting around slide 46 of

"Contributing to OpenJDK: Participating in stewardship for the long-term,"
https://jcp.org/aboutJava/communityprocess/ec-public/materials/2023-06-13/Contributing_to_OpenJDK_2023_04_12.pdf

This approach has evolved over the years and releases.

In this case semantically, the array returned by getMethods is a set and
no particular meaning should be read into the order of the elements.

HTH,

-Joe
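
One way to act on this advice is to impose an order yourself rather than rely on whatever order `getMethods` happens to return. The sketch below is illustrative only: the class `StableMethodOrder`, the interface `Sample`, and the sort key (name, parameter types, return type) are assumptions made for this example, not something the JDK or this thread prescribes.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class StableMethodOrder {
    // Hypothetical sample interface, used only for illustration.
    interface Sample {
        String footerHeading();
        boolean helpCommand();
        String name();
    }

    // Build a sort key from the method name, parameter types, and return type.
    static String key(Method m) {
        String params = Arrays.stream(m.getParameterTypes())
                .map(Class::getName)
                .collect(Collectors.joining(","));
        return m.getName() + "(" + params + ")" + m.getReturnType().getName();
    }

    // Treat the getMethods() result as a set and sort it by the key above,
    // so the outcome does not depend on the order the array arrives in.
    static List<String> stableKeys(Class<?> type) {
        return Arrays.stream(type.getMethods())
                .map(StableMethodOrder::key)
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        stableKeys(Sample.class).forEach(System.out::println);
    }
}
```

Any tool that compares reflective output across runs could sort on such a key first, which keeps the comparison independent of implementation details.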


Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-22 Thread Chen Liang
Hi Aman,
Even though the specification says "not in any particular order,"
getInterfaces and getMethods actually return ordered arrays, in the order
these methods/interfaces are declared in their class files.

I believe you are decompiling the proxy classes generated by an older
version of the JDK; for example, back in JDK 8, the proxy methods were not
ordered because they were tracked in a HashMap:
https://github.com/openjdk/jdk8u/blob/6b53212ef78ad50f9eede829c5ff87cadcdb434b/jdk/src/share/classes/sun/misc/ProxyGenerator.java#L405
Which is no longer the case:
https://github.com/openjdk/jdk/blob/d59c12fe1041a1f61f68408241a9aa4d96ac4fd2/src/java.base/share/classes/java/lang/reflect/ProxyGenerator.java#L241

- Chen


Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-22 Thread Aman Sharma
Hi,


Another thing I wanted to look into in this thread was the order of fields in
the generated Proxy classes. Their names are also based on a number. The same
proxy class can have a different order of `Method` fields across executions,
and the methods can be mapped to different field names.


For example, consider the proxy class based on `picocli.CommandLine`
<https://github.com/remkop/picocli/blob/da98db63d1b516141b7485881b0dcddfd082dbc8/src/main/java/picocli/CommandLine.java#L4541>
in two different executions.

// fields and methods are truncated for brevity
public final class $Proxy9 extends Proxy implements CommandLine.Command {
    private static Method m1;
    private static Method m32;
    private static Method m21;
    private static Method m43;
    private static Method m36;
    private static Method m27;

    public final boolean helpCommand() {
        try {
            return (Boolean)super.h.invoke(this, m32, (Object[])null);
        } catch (RuntimeException | Error var2) {
            throw var2;
        } catch (Throwable var3) {
            throw new UndeclaredThrowableException(var3);
        }
    }
    // remaining methods truncated
}

// fields and methods are truncated for brevity
public final class $Proxy13 extends Proxy implements CommandLine.Command {
    private static Method m1;
    private static Method m29;
    private static Method m16;
    private static Method m40;
    private static Method m38;
    private static Method m12;

    public final boolean helpCommand() {
        try {
            return (Boolean)super.h.invoke(this, m29, (Object[])null);
        } catch (RuntimeException | Error var2) {
            throw var2;
        } catch (Throwable var3) {
            throw new UndeclaredThrowableException(var3);
        }
    }
    // remaining methods truncated
}


Notice the difference in the order of fields, and that the `helpCommand`
method is mapped to a different field name in each class. This happens because
the method array returned by `getMethods` is not sorted in any particular order
<https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/Class.java#L2178>
when generating a proxy class. What dictates this order? And why is it not
deterministic?


Regards,
Aman Sharma
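
For readers who want to reproduce the setup, here is a minimal sketch of creating a dynamic proxy and dispatching on the `Method` passed to the handler. The interface `Command` below is a hypothetical stand-in (it is not picocli's type), and `ProxyDispatchDemo` and `newProxy` are names invented for this example. The point it illustrates is that the handler can key off `Method.getName()`, which is stable, rather than off the generated field names (`m29`, `m32`, ...), which are not.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ProxyDispatchDemo {
    // Hypothetical interface used only for this illustration.
    public interface Command {
        boolean helpCommand();
        String footerHeading();
    }

    static Command newProxy() {
        // The handler sees the reflective Method, not the generated static field.
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "helpCommand":
                    return Boolean.TRUE;
                case "footerHeading":
                    return "footer";
                default:
                    throw new UnsupportedOperationException(method.getName());
            }
        };
        return (Command) Proxy.newProxyInstance(
                Command.class.getClassLoader(),
                new Class<?>[] { Command.class },
                handler);
    }

    public static void main(String[] args) {
        Command c = newProxy();
        // The generated class name is unspecified and may differ between runs.
        System.out.println(c.getClass().getName());
        System.out.println(c.helpCommand());
        System.out.println(c.footerHeading());
    }
}
```

Behavior observed through the interface is the same whatever the proxy class is called; only tooling that inspects the generated class itself is affected by the naming and field order.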

Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-22 Thread Aman Sharma
Hi Chen,


That's clear. Thanks for letting me know. I guess then Project Leyden is
working on naming the hidden classes deterministically to achieve their
goals <https://openjdk.org/projects/leyden/notes/01-beginnings>.


Regards,
Aman Sharma

Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-22 Thread Chen Liang
Hi Aman,
We have tried defining Proxy as hidden classes; a previous attempt was on
hold because of issues with serialization. Otherwise, Proxies work great as
hidden classes.

Chen

Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-20 Thread Aman Sharma
Hi David,


> I would not expect any class load events.


I understand. I also haven't tried to intercept them, but I see only one
approach right now to include them in an allowlist: 1) statically look for
invocations of "Lookup::defineHiddenClass", and 2) instrument them so that the
first argument, "bytes", can be inspected. I haven't looked into it much
because I did not know much about it, and the fact that they are hidden made
it harder. Thanks for sharing the JEP!


> java.lang.reflect.Proxy could define hidden classes to act as the proxy
> classes which implement proxy interfaces; from JEP 371


It says that Proxy classes will also become hidden classes. Is it underway?
Right now one can intercept them, transform them, and include them in an
allowlist. What do you think of naming them independently of AtomicLong so
that a proxy class generated at runtime is easy to look up in the allowlist?



Regards,
Aman Sharma
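
To make the allowlist idea concrete, here is a rough sketch of a name-independent check that keys on a digest of the class bytes instead of the class name. Everything in it is an assumption for illustration: `BytecodeAllowlist`, `sha256Hex`, and `isAllowed` are made-up names, and the byte arrays are stand-ins rather than real class files. It does not show how to obtain the bytes of a hidden or proxy class, which is the open question in this thread, and a raw digest would still differ whenever the generated bytecode itself differs between runs.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Set;

public class BytecodeAllowlist {
    // Hex-encoded SHA-256 digest of the given bytes.
    static String sha256Hex(byte[] classBytes) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(classBytes);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // A class is accepted if the digest of its bytes is in the allowlist,
    // regardless of the name it was defined under.
    static boolean isAllowed(Set<String> allowlist, byte[] classBytes) {
        return allowlist.contains(sha256Hex(classBytes));
    }

    public static void main(String[] args) {
        // Stand-in byte arrays; real use would pass class file bytes.
        byte[] known = "stand-in for known class bytes".getBytes(StandardCharsets.UTF_8);
        byte[] other = "stand-in for unknown class bytes".getBytes(StandardCharsets.UTF_8);
        Set<String> allowlist = Set.of(sha256Hex(known));
        System.out.println(isAllowed(allowlist, known));
        System.out.println(isAllowed(allowlist, other));
    }
}
```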


From: David Holmes 
Sent: Monday, May 20, 2024 2:30:37 PM
To: Aman Sharma; liangchenb...@gmail.com
Cc: core-libs-dev@openjdk.org; leyden-...@openjdk.org
Subject: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

On 20/05/2024 10:12 pm, Aman Sharma wrote:
> Hi David,
>
>
>  > How did you try to intercept them? Hidden classes are not "loaded" in
> the normal sense so won't trigger class load events.
>
>
> I could not intercept them. I only see them when I pass `-verbose:class`
> in the Java CLI.

Yes that is why I asked how you tried to intercept them.

>
> I also couldn't intercept them using the JVMTI Class File Load Hook
> <https://docs.oracle.com/en/java/javase/21/docs/specs/jvmti.html#ClassFileLoadHook>
> event. However, JEP 371 suggests that it should be possible to intercept them
> using the JVMTI Class Load
> <https://docs.oracle.com/en/java/javase/21/docs/specs/jvmti.html#ClassLoad>
> event, but I won't have the bytecode at this stage. So is there no way to get
> its bytecode before it is linked and initialized in the JVM?

Hidden classes are not loaded so I would not expect any class load
events. However, the exact nature of the JVMTI class load event is
unclear, as it talks about "class or interface creation", which is neither
loading nor defining per se. But a class prepare event sounds like it
should be issued. However, neither gives you access to the bytecode of the
class AFAICS.

David
-


>
> Regards,
> Aman Sharma
>
> --------
> *From:* David Holmes 
> *Sent:* Monday, May 20, 2024 2:59:17 AM
> *To:* Aman Sharma; liangchenb...@gmail.com
> *Cc:* core-libs-dev@openjdk.org; leyden-...@openjdk.org
> *Subject:* Re: Deterministic naming of subclasses of
> `java/lang/reflect/Proxy`
> On 17/05/2024 9:43 pm, Aman Sharma wrote:
>> Hi Chen,
>>
>>  > java.lang.invoke.LambdaForm$MH/0x0200cc000400
>>
>> I do see this as output when I pass -verbose:class. However, based on my
>> experiments, I have seen that neither an agent passed via 'javaagent'
>> nor an agent passed via 'agentpath' is able to intercept this hidden class.
>
> How did you try to intercept them? Hidden classes are not "loaded" in
> the normal sense so won't trigger class load events.
>
>> Also, I was a bit confused since I saw somewhere that the names of
>> hidden classes are null. But thanks for clarifying here.
>
> The JEP clearly defines the name format for hidden classes - though the
> final component is VM specific (and typically a hashcode).
>
> https://openjdk.org/jeps/371 <https://openjdk.org/jeps/371>
>
> Cheers,
> David
> -
>
>>  > avoid dynamic class loading
>>
>> I don't see dynamic class loading as a problem. I only mind some
>> unstable generation aspects of them which make it hard to verify them
>> based on an allowlist.
>>
>> For example, if this hidden class is generated with the exact same name
>> and the exact same bytecode during runtim

Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-20 Thread David Holmes

On 20/05/2024 10:12 pm, Aman Sharma wrote:

> Hi David,
>
>  > How did you try to intercept them? Hidden classes are not "loaded" in
> the normal sense so won't trigger class load events.
>
> I could not intercept them. I only see them when I pass `-verbose:class`
> in the Java CLI.

Yes that is why I asked how you tried to intercept them.

> I also couldn't intercept them using JVMTI Class File Load Hook
> <https://docs.oracle.com/en/java/javase/21/docs/specs/jvmti.html#ClassFileLoadHook>
> event. However JEP 371 suggests that it should be possible to intercept them
> using JVMTI Class Load
> <https://docs.oracle.com/en/java/javase/21/docs/specs/jvmti.html#ClassLoad>
> event, but I won't have the bytecode at this stage. So is there no way to get
> its bytecode before it is linked and initialized in the JVM?

Hidden classes are not loaded so I would not expect any class load
events. However the exact nature of the JVMTI class load event is
unclear as it talks about "class or interface creation" which is neither
loading nor defining per se. But a class prepare event sounds like it
should be issued. However neither gives you access to the bytecode of the
class AFAICS.


David
-




Regards,
Aman Sharma

PhD Student
KTH Royal Institute of Technology
School of Electrical Engineering and Computer Science (EECS)
Department of Theoretical Computer Science (TCS)
https://algomaster99.github.io/


*From:* David Holmes 
*Sent:* Monday, May 20, 2024 2:59:17 AM
*To:* Aman Sharma; liangchenb...@gmail.com
*Cc:* core-libs-dev@openjdk.org; leyden-...@openjdk.org
*Subject:* Re: Deterministic naming of subclasses of 
`java/lang/reflect/Proxy`

On 17/05/2024 9:43 pm, Aman Sharma wrote:

Hi Chen,

  > java.lang.invoke.LambdaForm$MH/0x0200cc000400

I do see this as output when I pass -verbose:class. However, based on my 
experiments, I have seen that neither an agent passed via 'javaagent' 
nor an agent passed via 'agentpath' is able to intercept this hidden class.


How did you try to intercept them? Hidden classes are not "loaded" in
the normal sense so won't trigger class load events.

Also, I was a bit confused since I saw somewhere that the names of 
hidden classes are null. But thanks for clarifying here.


The JEP clearly defines the name format for hidden classes - though the
final component is VM specific (and typically a hashcode).

https://openjdk.org/jeps/371 <https://openjdk.org/jeps/371>

Cheers,
David
-


  > avoid dynamic class loading

I don't see dynamic class loading as a problem. I only mind some 
unstable generation aspects of them which make it hard to verify them 
based on an allowlist.


For example, if this hidden class is generated with the exact same name 
and the exact same bytecode during runtime as well, it would be easy to 
verify it. However, I do see the names are based on some sort of memory 
address, and I don't know what bytecode it has, so I don't have 
suggestions to make them stable as of now. For Proxy classes, I feel it 
can be addressed unless you disagree or someone involved in Project Leyden 
does. :) Thank you for forwarding my mail there.


Regards,
Aman Sharma

PhD Student
KTH Royal Institute of Technology
https://algomaster99.github.io/ <https://algomaster99.github.io/> 

<https://algomaster99.github.io/ <https://algomaster99.github.io/>>



*From:* liangchenb...@gmail.com 
*Sent:* Friday, May 17, 2024 1:23:58 pm
*To:* Aman Sharma 
*Cc:* core-libs-dev@openjdk.org ; 
leyden-...@openjdk.org 
*Subject:* Re: Deterministic naming of subclasses of 
`java/lang/reflect/Proxy`


Hi Aman,
For `-verbose:class`, it's a JVM argument instead of a program argument; 
so when you run a java program like `java Main`, you should call it as 
`java -verbose:class Main`.

When done correctly, you should see hidden class outputs like:
[0.032s][info][class,load] 
java.lang.invoke.LambdaForm$MH/0x0200cc000400 source: 
__JVM_LookupDefineClass__
The loading of java.lang.invoke hidden classes requires your program to 
use MethodHandle features, like a lambda.


I think the problem you are exploring, that to avoid dynamic class 
loading and effectively turn Java Platform closed for security, is also 
being accomplished by project Leyden (as I've shared initially); Thus, I 
am forwarding this to leyden-dev instead, so you can see what approach 
Leyden uses to accomplish the same goal as yours.


Regards, Chen Liang

On Fri, May 17, 2024 at 4:40 AM Aman Sharma <aman...@kth.se> wrote:


 __

 Hi Roger,


 Do you have ideas on how to intercept them

Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-20 Thread Aman Sharma
Hi David,


> How did you try to intercept them? Hidden classes are not "loaded" in
the normal sense so won't trigger class load events.


I could not intercept them. I only see them when I pass `-verbose:class` in the 
Java CLI.


I also couldn't intercept them using JVMTI Class File Load 
Hook<https://docs.oracle.com/en/java/javase/21/docs/specs/jvmti.html#ClassFileLoadHook>
 event. However JEP 371 suggests that it should be possible to intercept them 
using JVMTI Class 
Load<https://docs.oracle.com/en/java/javase/21/docs/specs/jvmti.html#ClassLoad> 
event, but I won't have the bytecode at this stage. So is there no way to get 
its bytecode before it is linked and initialized in the JVM?


Regards,
Aman Sharma

PhD Student
KTH Royal Institute of Technology
School of Electrical Engineering and Computer Science (EECS)
Department of Theoretical Computer Science (TCS)
https://algomaster99.github.io/

From: David Holmes 
Sent: Monday, May 20, 2024 2:59:17 AM
To: Aman Sharma; liangchenb...@gmail.com
Cc: core-libs-dev@openjdk.org; leyden-...@openjdk.org
Subject: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

On 17/05/2024 9:43 pm, Aman Sharma wrote:
> Hi Chen,
>
>  > java.lang.invoke.LambdaForm$MH/0x0200cc000400
>
> I do see this as output when I pass -verbose:class. However, based on my
> experiments, I have seen that neither an agent passed via 'javaagent'
> nor an agent passed via 'agentpath' is able to intercept this hidden class.

How did you try to intercept them? Hidden classes are not "loaded" in
the normal sense so won't trigger class load events.

> Also, I was a bit confused since I saw somewhere that the names of
> hidden classes are null. But thanks for clarifying here.

The JEP clearly defines the name format for hidden classes - though the
final component is VM specific (and typically a hashcode).

https://openjdk.org/jeps/371

Cheers,
David
-

>  > avoid dynamic class loading
>
> I don't see dynamic class loading as a problem. I only mind some
> unstable generation aspects of them which make it hard to verify them
> based on an allowlist.
>
> For example, if this hidden class is generated with the exact same name
> and the exact same bytecode during runtime as well, it would be easy to
> verify it. However, I do see the names are based on some sort of memory
> address, and I don't know what bytecode it has, so I don't have
> suggestions to make them stable as of now. For Proxy classes, I feel it
> can be addressed unless you disagree or someone involved in Project Leyden
> does. :) Thank you for forwarding my mail there.
>
> Regards,
> Aman Sharma
>
> PhD Student
> KTH Royal Institute of Technology
> https://algomaster99.github.io/ <https://algomaster99.github.io/>
>
> 
> *From:* liangchenb...@gmail.com 
> *Sent:* Friday, May 17, 2024 1:23:58 pm
> *To:* Aman Sharma 
> *Cc:* core-libs-dev@openjdk.org ;
> leyden-...@openjdk.org 
> *Subject:* Re: Deterministic naming of subclasses of
> `java/lang/reflect/Proxy`
>
> Hi Aman,
> For `-verbose:class`, it's a JVM argument instead of a program argument;
> so when you run a java program like `java Main`, you should call it as
> `java -verbose:class Main`.
> When done correctly, you should see hidden class outputs like:
> [0.032s][info][class,load]
> java.lang.invoke.LambdaForm$MH/0x0200cc000400 source:
> __JVM_LookupDefineClass__
> The loading of java.lang.invoke hidden classes requires your program to
> use MethodHandle features, like a lambda.
>
> I think the problem you are exploring, that to avoid dynamic class
> loading and effectively turn Java Platform closed for security, is also
> being accomplished by project Leyden (as I've shared initially); Thus, I
> am forwarding this to leyden-dev instead, so you can see what approach
> Leyden uses to accomplish the same goal as yours.
>
> Regards, Chen Liang
>
> On Fri, May 17, 2024 at 4:40 AM Aman Sharma <aman...@kth.se> wrote:
>
> __
>
> Hi Roger,
>
>
> Do you have ideas on how to intercept them? My javaagent is not able
> to nor a JVMTI agent passed using `agentpath` option. It also does
> not seem to show up in logs when I pass `-verbose:class`.
>
>
> Also, what do you think of renaming the proxy classes as suggested
> below?
>
>
> Regards,
> Aman Sharma
>
> PhD Student
> KTH Royal Institute of Technology
> School o

Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-19 Thread David Holmes

On 17/05/2024 9:43 pm, Aman Sharma wrote:

> Hi Chen,
>
>  > java.lang.invoke.LambdaForm$MH/0x0200cc000400
>
> I do see this as output when I pass -verbose:class. However, based on my
> experiments, I have seen that neither an agent passed via 'javaagent'
> nor an agent passed via 'agentpath' is able to intercept this hidden class.

How did you try to intercept them? Hidden classes are not "loaded" in 
the normal sense so won't trigger class load events.

> Also, I was a bit confused since I saw somewhere that the names of
> hidden classes are null. But thanks for clarifying here.

The JEP clearly defines the name format for hidden classes - though the 
final component is VM specific (and typically a hashcode).

https://openjdk.org/jeps/371

Cheers,
David
-


 > avoid dynamic class loading

I don't see dynamic class loading as a problem. I only mind some 
unstable generation aspects of them which make it hard to verify them 
based on an allowlist.


For example, if this hidden class is generated with the exact same name 
and the exact same bytecode during runtime as well, it would be easy to 
verify it. However, I do see the names are based on some sort of memory 
address, and I don't know what bytecode it has, so I don't have 
suggestions to make them stable as of now. For Proxy classes, I feel it 
can be addressed unless you disagree or someone involved in Project Leyden 
does. :) Thank you for forwarding my mail there.


Regards,
Aman Sharma

PhD Student
KTH Royal Institute of Technology
https://algomaster99.github.io/ <https://algomaster99.github.io/>


*From:* liangchenb...@gmail.com 
*Sent:* Friday, May 17, 2024 1:23:58 pm
*To:* Aman Sharma 
*Cc:* core-libs-dev@openjdk.org ; 
leyden-...@openjdk.org 
*Subject:* Re: Deterministic naming of subclasses of 
`java/lang/reflect/Proxy`


Hi Aman,
For `-verbose:class`, it's a JVM argument instead of a program argument; 
so when you run a java program like `java Main`, you should call it as 
`java -verbose:class Main`.

When done correctly, you should see hidden class outputs like:
[0.032s][info][class,load] 
java.lang.invoke.LambdaForm$MH/0x0200cc000400 source: 
__JVM_LookupDefineClass__
The loading of java.lang.invoke hidden classes requires your program to 
use MethodHandle features, like a lambda.


I think the problem you are exploring, that to avoid dynamic class 
loading and effectively turn Java Platform closed for security, is also 
being accomplished by project Leyden (as I've shared initially); Thus, I 
am forwarding this to leyden-dev instead, so you can see what approach 
Leyden uses to accomplish the same goal as yours.


Regards, Chen Liang

On Fri, May 17, 2024 at 4:40 AM Aman Sharma <aman...@kth.se> wrote:


__

Hi Roger,


Do you have ideas on how to intercept them? My javaagent is not able
to nor a JVMTI agent passed using `agentpath` option. It also does
not seem to show up in logs when I pass `-verbose:class`.


Also, what do you think of renaming the proxy classes as suggested
below?


Regards,
Aman Sharma

PhD Student
KTH Royal Institute of Technology
School of Electrical Engineering and Computer Science (EECS)
Department of Theoretical Computer Science (TCS)

https://algomaster99.github.io/

*From:* core-libs-dev <core-libs-dev-r...@openjdk.org> on behalf of Roger Riggs
<roger.ri...@oracle.com>
*Sent:* Friday, May 17, 2024 4:57:46 AM
*To:* core-libs-dev@openjdk.org <mailto:core-libs-dev@openjdk.org>
    *Subject:* Re: Deterministic naming of subclasses of
`java/lang/reflect/Proxy`
Hi Aman,

You may also run into hidden classes (JEP 371: Hidden Classes) that
allow classes to be defined, at runtime, without names.
It has been proposed to use them for generated proxies but that
hasn't been implemented yet.
There are benefits to having nameless classes, because they can't be
referenced by name, only as a capability, they can be better
encapsulated.

fyi, Roger Riggs


On 5/16/24 8:11 AM, Aman Sharma wrote:


Hi,


Thanks for your response, Liang!


> I think you meant CVE-2021-42392 instead of 2022.


Sorry for the error. I indeed meant CVE-2021-42392
<https://nvd.nist.gov/vuln/detail/cve-2021-42392>.


> Leyden mainly avoids this unstable generation by performing a
training run to collect classes loaded


Would love to know the details of Project Leyden and how they
worked so far to focus on this goal. In our case, the training run
is the test suite.


>

Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-17 Thread Aman Sharma
Hi Chen,

> java.lang.invoke.LambdaForm$MH/0x0200cc000400

I do see this as output when I pass -verbose:class. However, based on my 
experiments, I have seen that neither an agent passed via 'javaagent' nor an 
agent passed via 'agentpath' is able to intercept this hidden class.

Also, I was a bit confused since I saw somewhere that the names of hidden 
classes are null. But thanks for clarifying here.

> avoid dynamic class loading

I don't see dynamic class loading as a problem. I only mind some unstable 
generation aspects of them which make it hard to verify them based on an 
allowlist.

For example, if this hidden class is generated with the exact same name and the 
exact same bytecode during runtime as well, it would be easy to verify it. 
However, I do see the names are based on some sort of memory address, and I 
don't know what bytecode it has, so I don't have suggestions to make them stable 
as of now. For Proxy classes, I feel it can be addressed unless you disagree or 
someone involved in Project Leyden does. :) Thank you for forwarding my mail there.

Regards,
Aman Sharma

PhD Student
KTH Royal Institute of Technology
https://algomaster99.github.io/


From: liangchenb...@gmail.com 
Sent: Friday, May 17, 2024 1:23:58 pm
To: Aman Sharma 
Cc: core-libs-dev@openjdk.org ; 
leyden-...@openjdk.org 
Subject: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

Hi Aman,
For `-verbose:class`, it's a JVM argument instead of a program argument; so 
when you run a java program like `java Main`, you should call it as `java 
-verbose:class Main`.
When done correctly, you should see hidden class outputs like:
[0.032s][info][class,load] java.lang.invoke.LambdaForm$MH/0x0200cc000400 
source: __JVM_LookupDefineClass__
The loading of java.lang.invoke hidden classes requires your program to use 
MethodHandle features, like a lambda.

I think the problem you are exploring, that to avoid dynamic class loading and 
effectively turn Java Platform closed for security, is also being accomplished 
by project Leyden (as I've shared initially); Thus, I am forwarding this to 
leyden-dev instead, so you can see what approach Leyden uses to accomplish the 
same goal as yours.

Regards, Chen Liang

On Fri, May 17, 2024 at 4:40 AM Aman Sharma <aman...@kth.se> wrote:

Hi Roger,


Do you have ideas on how to intercept them? My javaagent is not able to nor a 
JVMTI agent passed using `agentpath` option. It also does not seem to show up 
in logs when I pass `-verbose:class`.


Also, what do you think of renaming the proxy classes as suggested below?


Regards,
Aman Sharma

PhD Student
KTH Royal Institute of Technology
School of Electrical Engineering and Computer Science (EECS)
Department of Theoretical Computer Science (TCS)
https://algomaster99.github.io/

From: core-libs-dev <core-libs-dev-r...@openjdk.org> on behalf of Roger Riggs <roger.ri...@oracle.com>
Sent: Friday, May 17, 2024 4:57:46 AM
To: core-libs-dev@openjdk.org<mailto:core-libs-dev@openjdk.org>
Subject: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

Hi Aman,

You may also run into hidden classes (JEP 371: Hidden Classes) that allow 
classes to be defined, at runtime, without names.
It has been proposed to use them for generated proxies but that hasn't been 
implemented yet.
There are benefits to having nameless classes, because they can't be referenced 
by name, only as a capability, they can be better encapsulated.

fyi, Roger Riggs


On 5/16/24 8:11 AM, Aman Sharma wrote:

Hi,


Thanks for your response, Liang!


> I think you meant CVE-2021-42392 instead of 2022.


Sorry for the error. I indeed meant 
CVE-2021-42392<https://nvd.nist.gov/vuln/detail/cve-2021-42392>.


> Leyden mainly avoids this unstable generation by performing a training run to 
> collect classes loaded


Would love to know the details of Project Leyden and how they worked so far to 
focus on this goal. In our case, the training run is the test suite.


> GeneratedConstructorAccessor is already retired by JEP 416 [2] in Java 18


I did see them not appearing in my allowlist when I ran my study subject 
(Apache PDFBox) with Java 21. Thanks for letting me know about this JEP. I see 
they are re-implemented with method handles.


> How are you checking the classes?


To detect runtime generated code, we have javaagent that is hooked statically 
to the test suite execution. It gives us all classes that that is loaded post 
the JVM and the javaagent are loaded. So we only check the classes loaded for 
the purpose of running the application. This is also why we did not choose 
-agentlib as it would give classes for the setting up JVM and javaagent and we 
the user of o

Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-17 Thread -
Hi Aman,
For `-verbose:class`, it's a JVM argument instead of a program argument; so
when you run a java program like `java Main`, you should call it as `java
-verbose:class Main`.
When done correctly, you should see hidden class outputs like:
[0.032s][info][class,load] java.lang.invoke.LambdaForm$MH/0x0200cc000400 source: __JVM_LookupDefineClass__
The loading of java.lang.invoke hidden classes requires your program to use
MethodHandle features, like a lambda.
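
To see the same thing from inside a program rather than in the `-verbose:class` log, a minimal sketch along these lines may help. It assumes JDK 15 or later (where `Class.isHidden()` exists and lambdas are spun as hidden classes); the class name `HiddenLambdaDemo` is made up for illustration, and the class it inspects is the lambda's own implementation class, not the `LambdaForm$MH` class from the log line above.

```java
// Sketch only: inspects the runtime class behind a lambda.
// Assumes JDK 15+; HiddenLambdaDemo is a hypothetical name.
final class HiddenLambdaDemo {

    // Returns the runtime class that backs a freshly created lambda.
    static Class<?> lambdaClass() {
        Runnable r = () -> { };
        return r.getClass();
    }

    public static void main(String[] args) {
        Class<?> c = lambdaClass();
        // The name has the form <binary name>/<VM-specific suffix>,
        // so it is not stable across executions.
        System.out.println("name     = " + c.getName());
        System.out.println("isHidden = " + c.isHidden());
    }
}
```

Running it with `java -verbose:class HiddenLambdaDemo` should show the corresponding class in the class-load log as well.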

I think the problem you are exploring, namely avoiding dynamic class loading
and effectively closing the Java Platform for security, is also being
addressed by Project Leyden (as I shared initially). I am therefore
forwarding this to leyden-dev, so you can see what approach Leyden
uses to accomplish the same goal as yours.

Regards, Chen Liang

On Fri, May 17, 2024 at 4:40 AM Aman Sharma  wrote:

> Hi Roger,
>
>
> Do you have ideas on how to intercept them? My javaagent is not able to
> nor a JVMTI agent passed using `agentpath` option. It also does not seem to
> show up in logs when I pass `-verbose:class`.
>
>
> Also, what do you think of renaming the proxy classes as suggested below?
>
>
> Regards,
> Aman Sharma
>
> PhD Student
> KTH Royal Institute of Technology
> School of Electrical Engineering and Computer Science (EECS)
> Department of Theoretical Computer Science (TCS)
> https://algomaster99.github.io/
> --
> *From:* core-libs-dev  on behalf of Roger
> Riggs 
> *Sent:* Friday, May 17, 2024 4:57:46 AM
> *To:* core-libs-dev@openjdk.org
> *Subject:* Re: Deterministic naming of subclasses of
> `java/lang/reflect/Proxy`
>
> Hi Aman,
>
> You may also run into hidden classes (JEP 371: Hidden Classes) that allow
> classes to be defined, at runtime, without names.
> It has been proposed to use them for generated proxies but that hasn't
> been implemented yet.
> There are benefits to having nameless classes, because they can't be
> referenced by name, only as a capability, they can be better encapsulated.
>
> fyi, Roger Riggs
>
>
> On 5/16/24 8:11 AM, Aman Sharma wrote:
>
> Hi,
>
>
> Thanks for your response, Liang!
>
>
> > I think you meant CVE-2021-42392 instead of 2022.
>
>
> Sorry for the error. I indeed meant CVE-2021-42392
> <https://nvd.nist.gov/vuln/detail/cve-2021-42392>.
>
>
> > Leyden mainly avoids this unstable generation by performing a training
> run to collect classes loaded
>
>
> Would love to know the details of Project Leyden and how they worked so
> far to focus on this goal. In our case, the training run is the test suite.
>
>
> > GeneratedConstructorAccessor is already retired by JEP 416 [2] in Java
> 18
>
>
> I did see them not appearing in my allowlist when I ran my study subject
> (Apache PDFBox) with Java 21. Thanks for letting me know about this JEP. I
> see they are re-implemented with method handles.
>
>
> > How are you checking the classes?
>
>
> To detect runtime generated code, we have javaagent that is hooked
> statically to the test suite execution. It gives us all classes that that
> is loaded post the JVM and the javaagent are loaded. So we only check the
> classes loaded for the purpose of running the application. This is also why
> we did not choose -agentlib as it would give classes for the setting up JVM
> and javaagent and we the user of our tool must the classes they load.
>
>
> Next, we have a `ClassFileTransformer` hook in the agent where we produce
> the checksum using the bytecode. And we compare the checksum with the one
> existing in the allowlist. The checksum computation algorithm is same for
> both steps. Let me describe how I compute the checksum.
>
>
>
> 1. I get the CONSTANT_Class_info
> <https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.1>
> entry corresponding to `this_class` and rewrite the CONSTANT_Utf8_info
> <https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.7>
> it refers to with a fixed String constant, say "foo".
> 2. Since the name of the class is used to refer to its members
> (fields/methods), I get all CONSTANT_Fieldref_info
> <https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.2>
> entries, and if the `class_index` of an entry corresponds to the old
> `this_class`, we rewrite the UTF8 value behind class_index to the same
> constant "foo".
> 3. Next, since the names of the fields in Proxy classes are also
> suffixed by numbers, 

Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-17 Thread Aman Sharma
Hi Roger,


Do you have ideas on how to intercept them? Neither my javaagent nor a 
JVMTI agent passed using the `agentpath` option is able to. They also do not seem to show up 
in logs when I pass `-verbose:class`.


Also, what do you think of renaming the proxy classes as suggested below?


Regards,
Aman Sharma

PhD Student
KTH Royal Institute of Technology
School of Electrical Engineering and Computer Science (EECS)
Department of Theoretical Computer Science (TCS)
https://algomaster99.github.io/

From: core-libs-dev  on behalf of Roger Riggs 

Sent: Friday, May 17, 2024 4:57:46 AM
To: core-libs-dev@openjdk.org
Subject: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

Hi Aman,

You may also run into hidden classes (JEP 371: Hidden Classes) that allow 
classes to be defined, at runtime, without names.
It has been proposed to use them for generated proxies but that hasn't been 
implemented yet.
There are benefits to having nameless classes: because they can't be referenced 
by name, only as a capability, they can be better encapsulated.

fyi, Roger Riggs


On 5/16/24 8:11 AM, Aman Sharma wrote:

Hi,


Thanks for your response, Liang!


> I think you meant CVE-2021-42392 instead of 2022.


Sorry for the error. I indeed meant 
CVE-2021-42392<https://nvd.nist.gov/vuln/detail/cve-2021-42392>.


> Leyden mainly avoids this unstable generation by performing a training run to 
> collect classes loaded


Would love to know the details of Project Leyden and how they worked so far to 
focus on this goal. In our case, the training run is the test suite.


> GeneratedConstructorAccessor is already retired by JEP 416 [2] in Java 18


I did see them not appearing in my allowlist when I ran my study subject 
(Apache PDFBox) with Java 21. Thanks for letting me know about this JEP. I see 
they are re-implemented with method handles.


> How are you checking the classes?


To detect runtime generated code, we have a javaagent that is hooked statically 
to the test suite execution. It gives us all classes that are loaded after 
the JVM and the javaagent are loaded. So we only check the classes loaded for 
the purpose of running the application. This is also why we did not choose 
-agentlib as it would give classes for the setting up JVM and javaagent and we 
the user of our tool must the classes they load.
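
As a rough illustration of this kind of agent (not the actual tool from this thread), the sketch below registers a `ClassFileTransformer` that hashes every class file it is handed and records the names whose hash is not in an allowlist. `AllowlistAgent`, `rejected` and `sha256` are hypothetical names, hashing the raw bytes with SHA-256 is an assumption, and loading the real allowlist is left out.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of an allowlist-checking javaagent.
final class AllowlistAgent implements ClassFileTransformer {
    private final Set<String> allowedChecksums;
    // Names of classes whose bytes were not found in the allowlist.
    final List<String> rejected = new CopyOnWriteArrayList<>();

    AllowlistAgent(Set<String> allowedChecksums) {
        this.allowedChecksums = allowedChecksums;
    }

    // Entry point for: java -javaagent:agent.jar ...
    // (the jar manifest needs a Premain-Class entry; an empty allowlist
    // is used here only as a placeholder).
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new AllowlistAgent(Set.of()));
    }

    // Hex-encoded SHA-256 of the given bytes.
    static String sha256(byte[] bytes) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(bytes);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        if (!allowedChecksums.contains(sha256(classfileBuffer))) {
            // A real tool might log or abort here; this sketch only records.
            rejected.add(String.valueOf(className));
        }
        return null; // null tells the JVM that the class bytes are unchanged
    }
}
```

Because the transformer is an ordinary object, it can be exercised in a unit test by calling `transform` directly, without starting a JVM with `-javaagent`.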


Next, we have a `ClassFileTransformer` hook in the agent where we produce the 
checksum from the bytecode and compare it with the one existing 
in the allowlist. The checksum computation algorithm is the same for both steps. 
Let me describe how I compute the checksum.


  1.  I get the 
CONSTANT_Class_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.1>
 entry corresponding to `this_class` and rewrite the 
CONSTANT_Utf8_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.7>
 it refers to with a fixed String constant, say "foo".
  2.  Since the name of the class is used to refer to its members 
(fields/methods), I get all 
CONSTANT_Fieldref_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.2>
 entries, and if the `class_index` of an entry corresponds to the old `this_class`, we rewrite the 
UTF8 value behind class_index to the same constant "foo".
  3.  Next, since the names of the fields in Proxy classes are also suffixed 
by numbers, for example, `private static Method m4`, we rewrite the UTF8 value 
of the name in the 
CONSTANT_NameAndType_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.6>.
  4.  These fields can also have a random order, so we simply sort the entire 
bytecode using `Arrays.sort(byte[])` to eliminate any differences due to the 
ordering of fields/methods.
  5.  Simply sorting the byte array still left minute differences. I could not 
understand why they existed even though the values in the constant pool of the bytecode 
in the allowlist and at runtime were exactly the same after rewriting. The 
differences existed in the bytes of the Code attribute of methods. I concluded 
that these bytes stored some position information. To avoid this, I created a 
subarray containing only the bytes corresponding to 
`CONSTANT_Utf8_info.bytes`. Computing a checksum over it resulted in the 
same checksum for both classfiles.
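
One possible reading of steps 4 and 5 is sketched below: walk the constant pool, keep only the bytes of the `CONSTANT_Utf8_info` entries, sort them, and hash the result. This is an illustration under assumptions, not the code used in the study: the class name `Utf8Checksum`, the choice of SHA-256, and the omission of the renaming from steps 1 to 3 are all mine. The tag sizes follow the constant pool layout in JVMS chapter 4.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical sketch: order-insensitive checksum over CONSTANT_Utf8 bytes.
final class Utf8Checksum {

    // Concatenates the raw bytes of every CONSTANT_Utf8_info entry.
    static byte[] utf8Bytes(byte[] classFile) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile));
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort();             // minor_version
        in.readUnsignedShort();             // major_version
        int count = in.readUnsignedShort(); // constant_pool_count
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: // Utf8: u2 length followed by the bytes
                    byte[] b = new byte[in.readUnsignedShort()];
                    in.readFully(b);
                    out.write(b);
                    break;
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    in.skipBytes(4);
                    break;
                case 5: case 6: // Long and Double occupy two pool slots
                    in.skipBytes(8);
                    i++;
                    break;
                case 7: case 8: case 16: case 19: case 20:
                    in.skipBytes(2);
                    break;
                case 15: // MethodHandle
                    in.skipBytes(3);
                    break;
                default:
                    throw new IOException("unknown constant pool tag " + tag);
            }
        }
        return out.toByteArray();
    }

    // Hex-encoded SHA-256 of the sorted Utf8 bytes.
    static String checksum(byte[] classFile) {
        try {
            byte[] bytes = utf8Bytes(classFile);
            Arrays.sort(bytes); // removes any dependence on entry order
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(bytes);
            StringBuilder hex = new StringBuilder();
            for (byte d : digest) {
                hex.append(String.format("%02x", d));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = Object.class.getResourceAsStream("/java/lang/Object.class")) {
            if (in == null) {
                throw new IOException("class bytes not found");
            }
            System.out.println(checksum(in.readAllBytes()));
        }
    }
}
```

Note that sorting individual bytes discards a lot of structure: two different classes whose Utf8 entries contain the same multiset of bytes would collide, so a stricter variant might sort whole entries instead.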

Let's understand the whole approach with an example of a Proxy class.

`

public final class $Proxy42 extends Proxy implements 
org.apache.logging.log4j.core.config.plugins.Plugin {

`

This will go in the allowlist as "Proxy_Plugin: ".
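
A key of this shape can be derived from the interfaces a proxy class implements rather than from its numbered name. The sketch below is only a guess at how such a key might be built; `ProxyKey`, `keyFor` and `newProxy` are hypothetical names, and joining simple interface names with "_" is an assumption.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical sketch: derive a name-independent allowlist key for a proxy class.
final class ProxyKey {

    // Builds a key such as "Proxy_Plugin" from the implemented interfaces.
    static String keyFor(Class<?> proxyClass) {
        if (!Proxy.isProxyClass(proxyClass)) {
            throw new IllegalArgumentException("not a proxy class: " + proxyClass);
        }
        return "Proxy_" + Arrays.stream(proxyClass.getInterfaces())
                .map(Class::getSimpleName)
                .collect(Collectors.joining("_"));
    }

    // Creates a proxy whose methods all return null (enough for this demo).
    static Object newProxy(Class<?>... interfaces) {
        InvocationHandler noop = (proxy, method, args) -> null;
        return Proxy.newProxyInstance(ProxyKey.class.getClassLoader(), interfaces, noop);
    }

    public static void main(String[] args) {
        Class<?> c = newProxy(Runnable.class).getClass();
        System.out.println(c.getName() + " -> " + keyFor(c));
    }
}
```

Simple names can clash across packages, so a real allowlist would probably need fully qualified interface names in the key.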

When the same class is intercepted at runtime, say "$Proxy10", we look for 
"Proxy_Plugin" in the allowlist and since the checksum algorithm is same in 
both cases, we get a match a

Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-16 Thread Roger Riggs
School of Electrical Engineering and Computer Science (EECS)
Department of Theoretical Computer Science (TCS)
https://algomaster99.github.io/


*From:* liangchenb...@gmail.com 
*Sent:* Thursday, May 16, 2024 5:52:03 AM
*To:* Aman Sharma; core-libs-dev
*Cc:* Martin Monperrus
*Subject:* Re: Deterministic naming of subclasses of 
`java/lang/reflect/Proxy`

Hi Aman,
I think you meant CVE-2021-42392 instead of 2022.

For your approach of an "allowlist" for Java runtime, project Leyden 
is looking to generate a static image [1], that
> At run time it cannot load classes from outside the image, nor can 
it create classes dynamically.
Leyden mainly avoids this unstable generation by performing a training 
run to collect classes loaded and even object graphs; I am not 
familiar with the details unfortunately.


Otherwise, the Proxy discussion belongs better on core-libs-dev, as 
java.lang.reflect.Proxy is part of Java's core libraries. I am 
replying to this thread on core-libs-dev.


For your perceived problem that classes don't have unique names, your 
description sounds dubious: GeneratedConstructorAccessor is already 
retired by JEP 416 [2] in Java 18, and there are many other cases in 
which JDK generates classes without stable names, notoriously 
LambdaMetafactory (Gradle wished for cacheable Lambdas); the same 
applies for the generated classes for MethodHandle's LambdaForms 
(which carries implementation code for LambdaForm). How are you 
checking the classes? It seems you are not checking hidden classes. 
Proxy and Lambda classes are defined by the caller's class loader, 
while LambdaForms are under JDK's system class loader I think. We need 
to ensure you are correctly finding all unstable classes before we can 
proceed.


[1]: https://openjdk.org/projects/leyden/notes/01-beginnings
[2]: https://openjdk.org/jeps/416

On Wed, May 15, 2024 at 7:00 PM Aman Sharma  wrote:

Hi,


My name is Aman and I am a PhD student at KTH Royal Institute of
Technology, Stockholm, Sweden. I do research as part of the CHAINS
<https://chains.proj.kth.se/> project to strengthen the software
supply chain of multiple ecosystems. I particularly focus on
runtime integrity in Java. In this email, I want to write about an
issue I have discovered with /dynamic generation of
`java.lang.reflect.Proxy` classes/. I will propose a solution and
would love to hear feedback from the community. Let me know if
this is the correct mailing list for such discussions. It seemed
the most relevant from this list
<https://mail.openjdk.org/mailman/listinfo>.


*My research*


Java has features to load classes on the fly - it can either
download or generate a class at runtime. These features are useful
for the inner workings of the JDK, for example, implementing
annotations, reflective access, etc. However, these features have also
contributed to critical vulnerabilities in the past
- CVE-2021-44228 (log4shell), CVE-2022-33980, CVE-2022-42392. All
of these vulnerabilities have one thing in common - /a class that
was not known during build time was downloaded/generated at
runtime and loaded into the JVM./


To defend against such vulnerabilities, we propose a solution to
/allowlist classes for runtime/. This allowlist will contain an
exhaustive list of classes that can be loaded by the JVM and it
will be enforced at runtime. We build this allowlist from three
sources:

 1. All classes of all modules provided by the Java Standard
Library. We use ClassGraph
<https://github.com/classgraph/classgraph> to scan the JDK.
 2. We can take the source code and all dependencies of an
application. We use a software bill of materials to get all
the data.
 3. Finally, we run the test suite to include any runtime
downloaded/generated classes.

Such a list is able to prevent the above 3 CVEs because it does
not let "unknown" bytecode be loaded.

*Problem with generating such an allowlist*
The first two parts of the allowlist are easy to get. The problem
is with the third step, where we want to allowlist all the classes
that could be downloaded or generated. Upon running the test suite
and hooking into the classes it loads, we observe that the list
contains classes named "com/sun/proxy/$Proxy2",
"jdk/internal/reflect/GeneratedConstructorAccessor3" among many
more. The purpose of these classes can be identified. The proxy
class is created to implement an annotation. The accessor
gives the JVM access to the constructor of a class.

When enforcing this allowlist at runtime, we see that the bytecode
content for "com/sun/proxy/$Proxy2" diff

Re: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-16 Thread Aman Sharma
Hi,


> have not looked into LambdaMetafactory because I did not encounter it as a 
> problem so far


It is possible that java agents are unable to intercept it. `-verbose:class` 
logs classes such as 
"org.apache.pdfbox.cos.COSDocument$$Lambda/0x7a80631a0d08".

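A minimal sketch of how such a lambda class name can be observed from inside a program, without `-verbose:class`. The class name `LambdaNames` and its methods are hypothetical; `Class.isHidden()` requires Java 15 or later, and the exact suffix of the printed name is expected to vary between runs.

```java
final class LambdaNames {
    /** Returns the runtime class name of a lambda created in this class. */
    static String lambdaClassName() {
        Runnable r = () -> { };
        return r.getClass().getName();
    }

    /** Reports whether the lambda's class is a hidden class (Java 15+ API). */
    static boolean isHiddenLambda() {
        Runnable r = () -> { };
        return r.getClass().isHidden();
    }

    public static void main(String[] args) {
        // The name has the shape <enclosing class>$$Lambda... followed by a
        // run-specific suffix, similar to the -verbose:class line quoted above.
        System.out.println(lambdaClassName());
        System.out.println("hidden: " + isHiddenLambda());
    }
}
```

If the second line prints `hidden: true`, the class was defined as a hidden class, which is one possible reason an agent hook would not see it; this sketch does not itself test agent behavior.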

Regards,
Aman Sharma

PhD Student
KTH Royal Institute of Technology
School of Electrical Engineering and Computer Science (EECS)
Department of Theoretical Computer Science (TCS)
<http://www.kth.se><https://www.kth.se/profile/amansha><https://www.kth.se/profile/amansha>
<https://www.kth.se/profile/amansha>https://algomaster99.github.io/

From: Aman Sharma
Sent: Thursday, May 16, 2024 2:11:59 PM
To: liangchenb...@gmail.com; core-libs-dev
Cc: Martin Monperrus
Subject: Re: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`


Hi,


Thanks for your response, Liang!


> I think you meant CVE-2021-42392 instead of 2022.


Sorry for the error. I indeed meant 
CVE-2021-42392<https://nvd.nist.gov/vuln/detail/cve-2021-42392>.


> Leyden mainly avoids this unstable generation by performing a training run to 
> collect classes loaded


I would love to know the details of Project Leyden and how it has worked 
toward this goal so far. In our case, the training run is the test suite.


> GeneratedConstructorAccessor is already retired by JEP 416 [2] in Java 18


I did notice that they do not appear in my allowlist when I ran my study subject 
(Apache PDFBox) with Java 21. Thanks for letting me know about this JEP. I see 
they are re-implemented with method handles.


> How are you checking the classes?


To detect runtime-generated code, we have a javaagent that is attached statically 
to the test suite execution. It gives us all classes that are loaded after 
the JVM and the javaagent themselves are loaded, so we only check the classes 
loaded for the purpose of running the application. This is also why we did not 
choose -agentlib, as it would also report classes for setting up the JVM and 
the javaagent and we the user of our tool must the classes they load.


Next, we have a `ClassFileTransformer` hook in the agent where we compute a 
checksum of the bytecode and compare it with the one stored in the allowlist. 
The checksum computation algorithm is the same for both steps. 
Let me describe how I compute the checksum.


  1.  I get the 
CONSTANT_Class_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.1>
 entry corresponding to `this_class` and rewrite the corresponding 
CONSTANT_Utf8_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.7>
 to a fixed String constant, say "foo".
  2.  Since the name of the class is used to refer to its members 
(fields/methods), I get all 
CONSTANT_Fieldref_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.2>
 entries and, if an entry's `class_index` corresponds to the old `this_class`, 
rewrite the UTF8 value behind `class_index` to the same constant "foo".
  3.  Next, since the names of the fields in Proxy classes are also suffixed 
by numbers, for example, `private static Method m4`, we rewrite the UTF8 value 
of the name in the 
CONSTANT_NameAndType_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.6>.
  4.  These fields can also have a random order, so we simply sort the entire 
bytecode using `Arrays.sort(byte[])` to eliminate any differences due to 
the ordering of fields/methods.
  5.  Even after sorting the byte array, minute differences remained. I could not 
understand why they existed even though the values in the constant pool of the 
bytecode in the allowlist and at runtime were exactly the same after rewriting. The 
differences were in the bytes of the Code attribute of methods. I concluded 
that these bytes stored some position information. To avoid this, I created a 
subarray containing only the bytes corresponding to 
`CONSTANT_Utf8_info.bytes`. Computing a checksum over it resulted in the 
same checksum for both classfiles.

Let's understand the whole approach with an example of a Proxy class.

`

public final class $Proxy42 extends Proxy implements 
org.apache.logging.log4j.core.config.plugins.Plugin {

`

This will go in the allowlist as "Proxy_Plugin: ".

When the same class is intercepted at runtime, say "$Proxy10", we look for 
"Proxy_Plugin" in the allowlist and since the checksum algorithm is same in 
both cases, we get a match and let the class load.

This approach seems to work well for Proxy classes and 
GeneratedConstructorAccessor (which has been removed, as you said). I also 
looked at the species generated by method handles. I did not notice any 
modification in them. Their name generation seemed okay to me. If a new Species 
is generated, it is of course detected since it is not in the allowlist.

I have not looked into LambdaMetafactory becau

Re: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-16 Thread Aman Sharma
Hi,


Thanks for your response, Liang!


> I think you meant CVE-2021-42392 instead of 2022.


Sorry for the error. I indeed meant 
CVE-2021-42392<https://nvd.nist.gov/vuln/detail/cve-2021-42392>.


> Leyden mainly avoids this unstable generation by performing a training run to 
> collect classes loaded


I would love to know the details of Project Leyden and how it has worked 
toward this goal so far. In our case, the training run is the test suite.


> GeneratedConstructorAccessor is already retired by JEP 416 [2] in Java 18


I did notice that they do not appear in my allowlist when I ran my study subject 
(Apache PDFBox) with Java 21. Thanks for letting me know about this JEP. I see 
they are re-implemented with method handles.


> How are you checking the classes?


To detect runtime-generated code, we have a javaagent that is attached statically 
to the test suite execution. It gives us all classes that are loaded after 
the JVM and the javaagent themselves are loaded, so we only check the classes 
loaded for the purpose of running the application. This is also why we did not 
choose -agentlib, as it would also report classes for setting up the JVM and 
the javaagent and we the user of our tool must the classes they load.
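A minimal sketch of a statically attached agent of this kind, assuming a hypothetical class `LoadedClassRecorder` that only records the names of classes passed to it. It is not the actual tool described in this thread; it only illustrates the `premain` entry point and the `ClassFileTransformer` callback from `java.lang.instrument`.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class LoadedClassRecorder implements ClassFileTransformer {
    // Names are in internal form, e.g. "com/sun/proxy/$Proxy2".
    private final Set<String> seen = ConcurrentHashMap.newKeySet();

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        if (className != null) {
            seen.add(className);
        }
        return null; // null tells the JVM to keep the bytecode unchanged
    }

    Set<String> seen() {
        return seen;
    }

    // Entry point named by the Premain-Class attribute in the agent JAR manifest.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoadedClassRecorder());
    }
}
```

Packaged in a JAR whose manifest contains `Premain-Class: LoadedClassRecorder`, it would be attached with `java -javaagent:<agent jar> ...`; only classes loaded after the agent is installed reach `transform`.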


Next, we have a `ClassFileTransformer` hook in the agent where we compute a 
checksum of the bytecode and compare it with the one stored in the allowlist. 
The checksum computation algorithm is the same for both steps. 
Let me describe how I compute the checksum.


  1.  I get the 
CONSTANT_Class_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.1>
 entry corresponding to `this_class` and rewrite the corresponding 
CONSTANT_Utf8_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.7>
 to a fixed String constant, say "foo".
  2.  Since the name of the class is used to refer to its members 
(fields/methods), I get all 
CONSTANT_Fieldref_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.2>
 entries and, if an entry's `class_index` corresponds to the old `this_class`, 
rewrite the UTF8 value behind `class_index` to the same constant "foo".
  3.  Next, since the names of the fields in Proxy classes are also suffixed 
by numbers, for example, `private static Method m4`, we rewrite the UTF8 value 
of the name in the 
CONSTANT_NameAndType_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.6>.
  4.  These fields can also have a random order, so we simply sort the entire 
bytecode using `Arrays.sort(byte[])` to eliminate any differences due to 
the ordering of fields/methods.
  5.  Even after sorting the byte array, minute differences remained. I could not 
understand why they existed even though the values in the constant pool of the 
bytecode in the allowlist and at runtime were exactly the same after rewriting. The 
differences were in the bytes of the Code attribute of methods. I concluded 
that these bytes stored some position information. To avoid this, I created a 
subarray containing only the bytes corresponding to 
`CONSTANT_Utf8_info.bytes`. Computing a checksum over it resulted in the 
same checksum for both classfiles.
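A rough sketch of steps 1 and 5, under stated assumptions: the class `Utf8Checksum`, the use of SHA-256, and sorting whole `CONSTANT_Utf8_info` entries (rather than sorting individual bytes as in step 4) are choices made for this illustration and are not the exact procedure described above. Step 3 (renaming `m4`-style fields) is left out. The constant pool layout follows JVMS section 4.4.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class Utf8Checksum {
    private static final byte[] PLACEHOLDER = "foo".getBytes(StandardCharsets.UTF_8);

    /** Hashes the Utf8 constants of a class file after replacing the this_class name. */
    static String checksum(byte[] classFile) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(classFile));
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.skipBytes(4);                               // minor_version, major_version
            int count = in.readUnsignedShort();            // constant_pool_count
            Map<Integer, byte[]> utf8 = new HashMap<>();
            Map<Integer, Integer> classNameIndex = new HashMap<>();
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1:                                // CONSTANT_Utf8
                        byte[] bytes = new byte[in.readUnsignedShort()];
                        in.readFully(bytes);
                        utf8.put(i, bytes);
                        break;
                    case 7:                                // CONSTANT_Class
                        classNameIndex.put(i, in.readUnsignedShort());
                        break;
                    case 5: case 6:                        // Long, Double use two slots
                        in.skipBytes(8);
                        i++;
                        break;
                    case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                        in.skipBytes(4);
                        break;
                    case 8: case 16: case 19: case 20:
                        in.skipBytes(2);
                        break;
                    case 15:                               // CONSTANT_MethodHandle
                        in.skipBytes(3);
                        break;
                    default:
                        throw new IllegalArgumentException("unknown constant pool tag " + tag);
                }
            }
            in.skipBytes(2);                               // access_flags
            Integer nameIndex = classNameIndex.get(in.readUnsignedShort()); // this_class
            if (nameIndex == null || !utf8.containsKey(nameIndex)) {
                throw new IllegalArgumentException("this_class does not resolve to a Utf8 name");
            }
            utf8.put(nameIndex, PLACEHOLDER);              // step 1: neutralize the class name

            List<byte[]> entries = new ArrayList<>(utf8.values());
            entries.sort(Arrays::compare);                 // make the hash order-insensitive
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            for (byte[] e : entries) {
                md.update((byte) (e.length >>> 8));        // length prefix keeps entries distinct
                md.update((byte) e.length);
                md.update(e);
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this variation, two class files that differ only in the `this_class` name and in constant pool order get the same checksum, while any change to another Utf8 constant changes it. Like the approach above, it ignores the Code attribute entirely.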

Let's understand the whole approach with an example of a Proxy class.

`

public final class $Proxy42 extends Proxy implements 
org.apache.logging.log4j.core.config.plugins.Plugin {

`

This will go in the allowlist as "Proxy_Plugin: ".

When the same class is intercepted at runtime, say "$Proxy10", we look for 
"Proxy_Plugin" in the allowlist and since the checksum algorithm is same in 
both cases, we get a match and let the class load.
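A small sketch of how such a name-independent key could be derived and looked up. It assumes the key is "Proxy_" followed by the simple names of the implemented interfaces, which matches the "Proxy_Plugin" example but is otherwise a guess; `ProxyKeys`, `allowlistKey`, and `isAllowed` are hypothetical names.

```java
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

final class ProxyKeys {
    /** Builds a key such as "Proxy_Plugin" that does not depend on the $ProxyN suffix. */
    static String allowlistKey(Class<?> proxyClass) {
        if (!Proxy.isProxyClass(proxyClass)) {
            throw new IllegalArgumentException(proxyClass + " is not a proxy class");
        }
        return "Proxy_" + Arrays.stream(proxyClass.getInterfaces())
                .map(Class::getSimpleName)
                .collect(Collectors.joining("_"));
    }

    /** Allows the class only if the allowlist holds the same checksum under its key. */
    static boolean isAllowed(Map<String, String> allowlist, Class<?> proxyClass, String checksum) {
        return checksum.equals(allowlist.get(allowlistKey(proxyClass)));
    }

    public static void main(String[] args) {
        Object proxy = Proxy.newProxyInstance(ProxyKeys.class.getClassLoader(),
                new Class<?>[] { Runnable.class }, (p, m, a) -> null);
        // The runtime name carries a number; the derived key does not.
        System.out.println(proxy.getClass().getName() + " -> " + allowlistKey(proxy.getClass()));
    }
}
```

Using simple names keeps the key readable but can collide when two interfaces share a simple name in different packages; fully qualified names would avoid that.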

This approach seems to work well for Proxy classes and 
GeneratedConstructorAccessor (which has been removed, as you said). I also 
looked at the species generated by method handles. I did not notice any 
modification in them. Their name generation seemed okay to me. If a new Species 
is generated, it is of course detected since it is not in the allowlist.

I have not looked into LambdaMetafactory because I did not encounter it as a 
problem so far, but I am aware its name generation is also unstable. I have run 
my approach on only a few projects. And for hidden classes, I assume the 
agent won't be able to intercept them, so detecting them would be really hard.


Regards,
Aman Sharma

PhD Student
KTH Royal Institute of Technology
School of Electrical Engineering and Computer Science (EECS)
Department of Theoretical Computer Science (TCS)
<http://www.kth.se><https://www.kth.se/profile/amansha><https://www.kth.se/profile/amansha>
<https://www.kth.se/profile/amansha>https://algomaster99.github.io/

From: liangchenb...@gmail.com 
Sent: Thursday, May 16, 2024 5:52:03 AM
To: Aman Sharma; core-libs-dev
Cc: Martin Monperrus
Subject: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

Hi Aman,

Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-15 Thread -
Hi Aman,
I think you meant CVE-2021-42392 instead of 2022.

For your approach of an "allowlist" for Java runtime, project Leyden is
looking to generate a static image [1], that
> At run time it cannot load classes from outside the image, nor can it
create classes dynamically.
Leyden mainly avoids this unstable generation by performing a training run
to collect classes loaded and even object graphs; I am not familiar with
the details unfortunately.

Otherwise, the Proxy discussion belongs on core-libs-dev, as
java.lang.reflect.Proxy is part of Java's core libraries. I am sending
this reply to core-libs-dev.

For your perceived problem that classes don't have unique names, your
description sounds dubious: GeneratedConstructorAccessor is already retired
by JEP 416 [2] in Java 18, and there are many other cases in which JDK
generates classes without stable names, notoriously LambdaMetafactory
(Gradle wished for cacheable Lambdas); the same applies for the generated
classes for MethodHandle's LambdaForms (which carries implementation code
for LambdaForm). How are you checking the classes? It seems you are not
checking hidden classes. Proxy and Lambda classes are defined by the
caller's class loader, while LambdaForms are under JDK's system class
loader I think. We need to ensure you are correctly finding all unstable
classes before we can proceed.
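A minimal sketch of the unstable naming for Proxy classes, assuming a hypothetical helper class `ProxyNaming`. It only shows that the generated name is assigned when the proxy class is first created, so the names printed depend on which proxies the JVM has already defined; the exact package and number are implementation details and are not asserted here.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

final class ProxyNaming {
    /** Creates (or reuses) the proxy class for a single interface and returns it. */
    static Class<?> proxyClassFor(Class<?> iface) {
        InvocationHandler noop = (proxy, method, a) -> null;
        return Proxy.newProxyInstance(ProxyNaming.class.getClassLoader(),
                new Class<?>[] { iface }, noop).getClass();
    }

    public static void main(String[] args) {
        // Creating these in the opposite order is expected to change which
        // interface ends up behind which generated name.
        Class<?> first = proxyClassFor(Runnable.class);
        Class<?> second = proxyClassFor(Comparable.class);
        System.out.println(first.getName() + " implements Runnable");
        System.out.println(second.getName() + " implements Comparable");
    }
}
```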

[1]: https://openjdk.org/projects/leyden/notes/01-beginnings
[2]: https://openjdk.org/jeps/416

On Wed, May 15, 2024 at 7:00 PM Aman Sharma  wrote:

> Hi,
>
>
> My name is Aman and I am a PhD student at KTH Royal Institute of
> Technology, Stockholm, Sweden. I do research as part of the CHAINS
> project to strengthen the software supply
> chain of multiple ecosystems. I particularly focus on runtime integrity in
> Java. In this email, I want to write about an issue I have discovered with
> *dynamic generation of `java.lang.reflect.Proxy` classes*. I will propose a
> solution and would love to hear feedback from the community. Let me
> know if this is the correct mailing list for such discussions. It seemed
> the most relevant from this list.
>
>
> *My research*
>
>
> Java has features to load classes on the fly - it can either download or
> generate a class at runtime. These features are useful for the inner workings
> of the JDK, for example, implementing annotations, reflective access, etc.
> However, these features have also contributed to critical vulnerabilities
> in the past - CVE-2021-44228 (log4shell), CVE-2022-33980, CVE-2022-42392.
> All of these vulnerabilities have one thing in common - *a class that was
> not known during build time was downloaded/generated at runtime and loaded
> into the JVM.*
>
>
> To defend against such vulnerabilities, we propose a solution to *allowlist
> classes for runtime*. This allowlist will contain an exhaustive list of
> classes that can be loaded by the JVM and it will be enforced at runtime.
> We build this allowlist from three sources:
>
> 1. All classes of all modules provided by the Java Standard Library.
>    We use ClassGraph to scan the JDK.
> 2. We can take the source code and all dependencies of an application.
>    We use a software bill of materials to get all the data.
> 3. Finally, we run the test suite to include any runtime
>    downloaded/generated classes.
>
> Such a list is able to prevent the above 3 CVEs because it does not let
> "unknown" bytecode be loaded.
>
> *Problem with generating such an allowlist*
>
> The first two parts of the allowlist are easy to get. The problem is with
> the third step, where we want to allowlist all the classes that could be
> downloaded or generated. Upon running the test suite and hooking into the
> classes it loads, we observe that the list contains classes
> named "com/sun/proxy/$Proxy2",
> "jdk/internal/reflect/GeneratedConstructorAccessor3" among many more. The
> purpose of these classes can be identified. The proxy class is created
> to implement an annotation. The accessor gives the JVM access to the
> constructor of a class.
>
> When enforcing this allowlist at runtime, we see that the bytecode content
> for "com/sun/proxy/$Proxy2" differs in the allowlist and at runtime. In
> our case, we are experimenting with pdfbox,
> so we created the allowlist using its
> test suite. Then we enforced this allowlist while running some of its
> subcommands. However, there was some other proxy class, say
> "com/sun/proxy/$Proxy5",
> at runtime that implemented the same interfaces and had the same methods as
> "com/sun/proxy/$Proxy2" in the allowlist. They only differed in the name
> of the class, order of fields, and types for field references. This could
> happen because the order of class loading is workload dependent, but
> it causes a problem when generating such an allowlist.
>
>
>