Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-30 Thread Aman Sharma
Hi Joe,


> As a general comment, it is _not_ the goal of the API specification to
> (over) specify exact behavior in cases like this.
>
> See as an example the discussion concerning behavioral compatibility
> starting around slide 46 of
>
> "Contributing to OpenJDK: Participating in stewardship for the long-term,"
> https://jcp.org/aboutJava/communityprocess/ec-public/materials/2023-06-13/Contributing_to_OpenJDK_2023_04_12.pdf
>
> This approach has evolved over the years and releases.
>
> In this case semantically, the array returned by getMethods is a set and
> no particular meaning should be read into the order of the elements.
>
> HTH,
>
> -Joe

Missed this email of yours. Thanks for making it clear.


Regards,
Aman Sharma

PhD Student
KTH Royal Institute of Technology
School of Electrical Engineering and Computer Science (EECS)
Department of Theoretical Computer Science (TCS)
https://www.kth.se/profile/amansha
https://algomaster99.github.io/


Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-22 Thread Aman Sharma
Hi,


Another thing I wanted to look into in this thread was the order of fields in 
the generated Proxy classes. Their names are also based on a number. The same 
proxy class can have a different order of `Method` fields across executions, 
and the methods can be mapped to different field names.


For example, consider the proxy class based on 
`picocli.CommandLine<https://github.com/remkop/picocli/blob/da98db63d1b516141b7485881b0dcddfd082dbc8/src/main/java/picocli/CommandLine.java#L4541>`
 in two different executions.

// first execution; fields and methods are truncated for brevity
public final class $Proxy9 extends Proxy implements CommandLine.Command {
    private static Method m1;
    private static Method m32;
    private static Method m21;
    private static Method m43;
    private static Method m36;
    private static Method m27;

    public final boolean helpCommand() {
        try {
            return (Boolean) super.h.invoke(this, m32, (Object[]) null);
        } catch (RuntimeException | Error var2) {
            throw var2;
        } catch (Throwable var3) {
            throw new UndeclaredThrowableException(var3);
        }
    }
    // remaining members elided
}

// second execution; fields and methods are truncated for brevity
public final class $Proxy13 extends Proxy implements CommandLine.Command {
    private static Method m1;
    private static Method m29;
    private static Method m16;
    private static Method m40;
    private static Method m38;
    private static Method m12;

    public final boolean helpCommand() {
        try {
            return (Boolean) super.h.invoke(this, m29, (Object[]) null);
        } catch (RuntimeException | Error var2) {
            throw var2;
        } catch (Throwable var3) {
            throw new UndeclaredThrowableException(var3);
        }
    }
    // remaining members elided
}


Notice that the order of the fields differs and that the `helpCommand` method 
is mapped to a different field name in each class. This happens because the 
method array returned by `getMethods` is not sorted in any particular order 
(https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/lang/Class.java#L2178)
when generating a proxy class. What dictates this order? And why is it not 
deterministic?
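For readers who only need a stable view of the methods for comparison purposes, a minimal sketch is to sort the array returned by `getMethods` by an explicit key. The class name `StableMethodOrder` and the key format are assumptions made for illustration; this does not change how the JDK numbers the `m<N>` fields inside a generated proxy class.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Comparator;

final class StableMethodOrder {
    /** Builds a sort key from name, parameter types, return type and declaring class. */
    static String key(Method m) {
        StringBuilder sb = new StringBuilder(m.getName()).append('(');
        for (Class<?> p : m.getParameterTypes()) {
            sb.append(p.getName()).append(';');
        }
        return sb.append(')')
                 .append(m.getReturnType().getName())
                 .append('@')
                 .append(m.getDeclaringClass().getName())
                 .toString();
    }

    /** Returns the public methods of the given type in a reproducible order. */
    static Method[] sortedMethods(Class<?> type) {
        Method[] methods = type.getMethods(); // order of this array is unspecified
        Arrays.sort(methods, Comparator.comparing(StableMethodOrder::key));
        return methods;
    }
}
```

Sorting on the caller's side treats the result as the set it semantically is, so two runs can be compared without relying on the element order.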


Regards,
Aman Sharma


Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-22 Thread Aman Sharma
Hi Chen,


That's clear. Thanks for letting me know. I guess then Project Leyden is 
working on naming the hidden classes deterministically to achieve their 
goals<https://openjdk.org/projects/leyden/notes/01-beginnings>.


Regards,
Aman Sharma


From: Chen Liang 
Sent: Wednesday, May 22, 2024 1:35:46 PM
To: Aman Sharma
Cc: David Holmes; core-libs-dev@openjdk.org; leyden-...@openjdk.org
Subject: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

Hi Aman,
We have tried defining Proxy as hidden classes; a previous attempt was on hold 
because of issues with serialization. Otherwise, Proxies work great as hidden 
classes.

Chen


Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-20 Thread Aman Sharma
Hi David,


> I would not expect any class load events.


I understand. I haven't tried to intercept them either, but I see only one 
approach right now to include them in an allowlist: 1) statically look for 
invocations of `Lookup::defineHiddenClass`, and 2) instrument them so that the 
first argument, `bytes`, can be inspected. I haven't looked into it much 
because I am not familiar with it, and the fact that they are hidden made it 
harder. 😅 Thanks for sharing the JEP!


> java.lang.reflect.Proxy could define hidden classes to act as the proxy classes
> which implement proxy interfaces

(from JEP 371)


It says that Proxy classes will also become hidden classes. Is it underway? 
Right now one can intercept them, transform them, and include them in an 
allowlist. What do you think of naming them independently of the `AtomicLong` 
counter so that a proxy class generated at runtime is easy to look up in the 
allowlist?
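As an illustration of what a counter-independent lookup key could look like today, the sketch below derives a key from the interfaces a proxy class implements rather than from its `$ProxyN` name. The class name `ProxyAllowlistKey` and the `Proxy_<Interface>` key format are assumptions for this example; simple interface names can collide, so fully qualified names may be the safer choice in practice.

```java
import java.lang.reflect.Proxy;

final class ProxyAllowlistKey {
    /**
     * Returns a key that does not depend on the numeric suffix of a proxy class.
     * Non-proxy classes are keyed by their ordinary binary name.
     */
    static String keyFor(Class<?> type) {
        if (!Proxy.isProxyClass(type)) {
            return type.getName();
        }
        StringBuilder key = new StringBuilder("Proxy");
        // getInterfaces() on a proxy class returns the interfaces in creation order.
        for (Class<?> itf : type.getInterfaces()) {
            key.append('_').append(itf.getSimpleName());
        }
        return key.toString();
    }
}
```

With this sketch, a proxy created for `Runnable` is keyed as `Proxy_Runnable` regardless of whether the runtime named it `$Proxy9` or `$Proxy13`.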



Regards,
Aman Sharma


From: David Holmes 
Sent: Monday, May 20, 2024 2:30:37 PM
To: Aman Sharma; liangchenb...@gmail.com
Cc: core-libs-dev@openjdk.org; leyden-...@openjdk.org
Subject: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

On 20/05/2024 10:12 pm, Aman Sharma wrote:
> Hi David,
>
>
>  > How did you try to intercept them? Hidden classes are not "loaded" in
> the normal sense so won't trigger class load events.
>
>
> I could not intercept them. I only see them when I pass `-verbose:class`
> in the Java CLI.

Yes that is why I asked how you tried to intercept them.

>
> I also couldn't intercept them using JVMTI Class File Load Hook
> <https://docs.oracle.com/en/java/javase/21/docs/specs/jvmti.html#ClassFileLoadHook>
>  event. However JEP 371 suggests that it should be possible to intercept them 
> using JVMTI Class Load 
> <https://docs.oracle.com/en/java/javase/21/docs/specs/jvmti.html#ClassLoad> 
> event, but I won't have the bytecode at this stage. So is there no way to get 
> its bytecode before it is linked and initialized in the JVM?

Hidden classes are not loaded so I would not expect any class load
events. However the exact nature of the JVMTI class load event is
unclear as it talks about "class or interface creation" which is neither
loading nor defining per se. But a class prepare event sounds like it
should be issued. However neither gives you access to the bytecode of the
class AFAICS.

David

Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-20 Thread Aman Sharma
Hi David,


> How did you try to intercept them? Hidden classes are not "loaded" in
> the normal sense so won't trigger class load events.


I could not intercept them. I only see them when I pass `-verbose:class` in the 
Java CLI.


I also couldn't intercept them using JVMTI Class File Load 
Hook<https://docs.oracle.com/en/java/javase/21/docs/specs/jvmti.html#ClassFileLoadHook>
 event. However JEP 371 suggests that it should be possible to intercept them 
using JVMTI Class 
Load<https://docs.oracle.com/en/java/javase/21/docs/specs/jvmti.html#ClassLoad> 
event, but I won't have the bytecode at this stage. So is there no way to get 
its bytecode before it is linked and initialized in the JVM?


Regards,
Aman Sharma

From: David Holmes 
Sent: Monday, May 20, 2024 2:59:17 AM
To: Aman Sharma; liangchenb...@gmail.com
Cc: core-libs-dev@openjdk.org; leyden-...@openjdk.org
Subject: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

On 17/05/2024 9:43 pm, Aman Sharma wrote:
> Hi Chen,
>
>  > java.lang.invoke.LambdaForm$MH/0x0200cc000400
>
> I do see this as output when I pass -verbose:class. However, based on my
> experiments, I have seen that neither an agent passed via 'javaagent'
> nor an agent passed via 'agentpath' is able to intercept this hidden class.

How did you try to intercept them? Hidden classes are not "loaded" in
the normal sense so won't trigger class load events.

> Also, I was a bit confused since I saw somewhere that the names of
> hidden classes are null. But thanks for clarifying here.

The JEP clearly defines the name format for hidden classes - though the
final component is VM specific (and typically a hashcode).

https://openjdk.org/jeps/371

Cheers,
David

Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-17 Thread Aman Sharma
Hi Chen,

> java.lang.invoke.LambdaForm$MH/0x0200cc000400

I do see this as output when I pass -verbose:class. However, based on my 
experiments, I have seen that neither an agent passed via 'javaagent' nor an 
agent passed via 'agentpath' is able to intercept this hidden class.

Also, I was a bit confused since I saw somewhere that the names of hidden 
classes are null. But thanks for clarifying here.

> avoid dynamic class loading

I don't see dynamic class loading as a problem. I only mind some unstable 
generation aspects of them which make it hard to verify them based on an 
allowlist.

For example, if this hidden class is generated with the exact same name and the 
exact same bytecode during runtime as well, it would be easy to verify it. 
However, I see that the names are based on some sort of memory address, and I 
don't know what bytecode it has, so I don't have suggestions to make them 
stable as of now. For Proxy classes, I feel it can be addressed unless you 
disagree or someone involved in Project Leyden does. :) Thank you for 
forwarding my mail there.

Regards,
Aman Sharma



From: liangchenb...@gmail.com 
Sent: Friday, May 17, 2024 1:23:58 pm
To: Aman Sharma 
Cc: core-libs-dev@openjdk.org ; 
leyden-...@openjdk.org 
Subject: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

Hi Aman,
For `-verbose:class`, it's a JVM argument instead of a program argument; so 
when you run a java program like `java Main`, you should call it as `java 
-verbose:class Main`.
When done correctly, you should see hidden class outputs like:
[0.032s][info][class,load] java.lang.invoke.LambdaForm$MH/0x0200cc000400 
source: __JVM_LookupDefineClass__
The loading of java.lang.invoke hidden classes requires your program to use 
MethodHandle features, like a lambda.
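A small sketch of the point above: on JDK 15 or later (an assumption, since `Class.isHidden()` was added with JEP 371), the class behind a lambda is a hidden class, and its name cannot be resolved through `Class.forName`. The class name `HiddenLambdaDemo` is made up for illustration.

```java
final class HiddenLambdaDemo {
    /** Returns the runtime class that backs a trivial lambda. */
    static Class<?> lambdaClass() {
        Runnable r = () -> { };
        return r.getClass();
    }

    /** Tries to look the class up by name, as a name-based allowlist check would. */
    static boolean resolvableByName(Class<?> c) {
        try {
            Class.forName(c.getName(), false, c.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Class<?> c = lambdaClass();
        // The suffix after '/' is VM specific and differs between runs.
        System.out.println(c.getName() + " hidden=" + c.isHidden()
                + " resolvableByName=" + resolvableByName(c));
    }
}
```

Running it with `java -verbose:class HiddenLambdaDemo` should also list the lambda class in the class-load log, which is the only place it was reported to show up in this thread.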

I think the problem you are exploring, namely avoiding dynamic class loading 
and effectively closing the Java Platform for security, is also being 
addressed by Project Leyden (as I've shared initially). Thus, I am forwarding 
this to leyden-dev instead, so you can see what approach Leyden uses to 
accomplish the same goal as yours.

Regards, Chen Liang


Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-17 Thread Aman Sharma
Hi Roger,


Do you have ideas on how to intercept them? Neither my javaagent nor a JVMTI 
agent passed using the `agentpath` option is able to. They also do not seem to 
show up in the logs when I pass `-verbose:class`.


Also, what do you think of renaming the proxy classes as suggested below?


Regards,
Aman Sharma


From: core-libs-dev  on behalf of Roger Riggs 

Sent: Friday, May 17, 2024 4:57:46 AM
To: core-libs-dev@openjdk.org
Subject: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

Hi Aman,

You may also run into hidden classes (JEP 371: Hidden Classes) that allow 
classes to be defined, at runtime, without names.
It has been proposed to use them for generated proxies but that hasn't been 
implemented yet.
There are benefits to having nameless classes: because they can't be 
referenced by name, only as a capability, they can be better encapsulated.

fyi, Roger Riggs


On 5/16/24 8:11 AM, Aman Sharma wrote:

Hi,


Thanks for your response, Liang!


> I think you meant CVE-2021-42392 instead of 2022.


Sorry for the error. I indeed meant 
CVE-2021-42392<https://nvd.nist.gov/vuln/detail/cve-2021-42392>.


> Leyden mainly avoids this unstable generation by performing a training run to 
> collect classes loaded


Would love to know the details of Project Leyden and how they worked so far to 
focus on this goal. In our case, the training run is the test suite.


> GeneratedConstructorAccessor is already retired by JEP 416 [2] in Java 18


I did see them not appearing in my allowlist when I ran my study subject 
(Apache PDFBox) with Java 21. Thanks for letting me know about this JEP. I see 
they are re-implemented with method handles.


> How are you checking the classes?


To detect runtime-generated code, we have a javaagent that is attached statically 
to the test suite execution. It gives us all classes that are loaded after 
the JVM and the javaagent are loaded. So we only check the classes loaded for 
the purpose of running the application. This is also why we did not choose 
-agentlib as it would give classes for the setting up JVM and javaagent and we 
the user of our tool must the classes they load.
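A minimal sketch of such an agent, assuming a `-javaagent` jar whose `Premain-Class` manifest attribute names the class below. The name `RecordingAgent` and the `SEEN` set are hypothetical; the actual tool discussed in this thread may be structured differently.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class RecordingAgent implements ClassFileTransformer {
    /** Internal names (for example "com/example/Foo") of classes seen so far. */
    static final Set<String> SEEN = ConcurrentHashMap.newKeySet();

    /** Entry point invoked by the JVM before main when started with -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new RecordingAgent());
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        if (className != null) {
            SEEN.add(className);
        }
        return null; // null tells the JVM that the class bytes are unchanged
    }
}
```

Because the transformer is registered in `premain`, it only observes classes defined after the agent itself has been set up.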


Next, we have a `ClassFileTransformer` hook in the agent where we produce the 
checksum from the bytecode and compare it with the one stored in the 
allowlist. The checksum computation algorithm is the same for both steps. 
Let me describe how I compute the checksum.


  1.  I get the 
CONSTANT_Class_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.1>
 entry corresponding to `this_class` and rewrite the corresponding 
CONSTANT_Utf8_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.7>
 to a fixed String constant, say "foo".
  2.  Since the name of the class is used to refer to its members 
(fields/methods), I get all 
CONSTANT_Fieldref_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.2>
 entries and, if the `class_index` of an entry corresponds to the old 
`this_class`, I rewrite the UTF8 value behind `class_index` to the same 
constant "foo".
  3.  Next, since the names of the fields in Proxy classes are also suffixed 
by numbers, for example `private static Method m4`, I rewrite the UTF8 value 
of the name in the 
CONSTANT_NameAndType_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.6>.
  4.  These fields can also have a random order, so I simply sort the entire 
bytecode using `Arrays.sort(byte[])` to eliminate any differences due to the 
ordering of fields/methods.
  5.  Simply sorting the byte array still left minute differences. I could not 
understand why they existed even though the values in the constant pool of the 
bytecode in the allowlist and at runtime were exactly the same after 
rewriting. The differences were in the bytes of the Code attribute of methods. 
I concluded that these bytes store some position information. To avoid this, I 
created a subarray containing only the bytes corresponding to 
`CONSTANT_Utf8_info.bytes`. Computing a checksum over it resulted in the 
same checksum for both classfiles.

Let's understand the whole approach with an example of a Proxy class.

`

public final class $Proxy42 extends Proxy implements 
org.apache.logging.log4j.core.config.plugins.Plugin {

`

This will go in the allowlist as "Proxy_Plugin: ".

When the same class is intercepted at runtime, say as "$Proxy10", we look for 
"Proxy_Plugin" in the allowlist, and since the checksum algorithm is the same 
in both cases, we get a match and let the class load.

Re: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-16 Thread Aman Sharma
Hi,


> have not looked into LambdaMetafactory because I did not encounter it as a 
> problem so far


It is possible that java agents are unable to intercept it. `-verbose:class` 
logs classes such as 
"org.apache.pdfbox.cos.COSDocument$$Lambda/0x7a80631a0d08".
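A minimal sketch for seeing such a name from inside the JVM, without 
`-verbose:class`. The class name `LambdaNameDemo` is only an illustration; the 
part of the printed name after `$$Lambda` depends on the JDK version and should 
not be relied upon.

```java
import java.util.function.Supplier;

final class LambdaNameDemo {
    /** Returns the runtime class name of a lambda created in this class. */
    static String lambdaClassName() {
        Supplier<String> supplier = () -> "x";
        // The class behind the lambda is spun at runtime by LambdaMetafactory.
        return supplier.getClass().getName();
    }

    public static void main(String[] args) {
        // Prints something starting with "LambdaNameDemo$$Lambda".
        System.out.println(lambdaClassName());
    }
}
```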


Regards,
Aman Sharma

PhD Student
KTH Royal Institute of Technology
School of Electrical Engineering and Computer Science (EECS)
Department of Theoretical Computer Science (TCS)
https://www.kth.se/profile/amansha
https://algomaster99.github.io/

Re: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

2024-05-16 Thread Aman Sharma
Hi,


Thanks for your response, Liang!


> I think you meant CVE-2021-42392 instead of 2022.


Sorry for the error. I indeed meant 
CVE-2021-42392<https://nvd.nist.gov/vuln/detail/cve-2021-42392>.


> Leyden mainly avoids this unstable generation by performing a training run to 
> collect classes loaded


I would love to know the details of Project Leyden and how it has worked 
toward this goal so far. In our case, the training run is the test suite.


> GeneratedConstructorAccessor is already retired by JEP 416 [2] in Java 18


I did see them not appearing in my allowlist when I ran my study subject 
(Apache PDFBox) with Java 21. Thanks for letting me know about this JEP. I see 
they are re-implemented with method handles.


> How are you checking the classes?


To detect runtime-generated code, we have a javaagent that is hooked statically 
to the test suite execution. It gives us all classes that are loaded after 
the JVM and the javaagent are loaded. So we only check the classes loaded for 
the purpose of running the application. This is also why we did not choose 
-agentlib, as it would also give us the classes loaded for setting up the JVM 
and the javaagent, and the user of our tool should only have to check the 
classes they load.
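To make the setup concrete, here is a minimal sketch of a statically attached 
agent, not our actual tool. The class name `LoadedClassRecorder`, the `seen` 
map, and the use of SHA-256 over the raw bytes are assumptions for 
illustration; `premain`, `Instrumentation.addTransformer`, and 
`ClassFileTransformer.transform` are the standard `java.lang.instrument` API.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.ProtectionDomain;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class LoadedClassRecorder implements ClassFileTransformer {
    /** Internal class name (e.g. "org/example/Foo") -> checksum of its bytes. */
    final Map<String, String> seen = new ConcurrentHashMap<>();

    /** Entry point named by the Premain-Class attribute of the agent jar. */
    public static void premain(String agentArgs, Instrumentation inst) {
        // Only classes defined after this call reach the transformer.
        inst.addTransformer(new LoadedClassRecorder());
    }

    @Override
    public byte[] transform(ClassLoader loader, String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // className can be null for some runtime-defined classes.
        String name = (className != null) ? className : "<unnamed>";
        seen.put(name, sha256(classfileBuffer));
        return null; // null tells the JVM to keep the class bytes unchanged
    }

    /** Hex-encoded SHA-256; the choice of hash is an assumption of this sketch. */
    static String sha256(byte[] data) {
        try {
            StringBuilder hex = new StringBuilder();
            for (byte b : MessageDigest.getInstance("SHA-256").digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Such an agent would be packaged in a jar whose manifest has 
`Premain-Class: LoadedClassRecorder` and attached with 
`java -javaagent:recorder.jar ...` (the jar name is hypothetical).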


Next, we have a `ClassFileTransformer` hook in the agent where we produce the 
checksum from the bytecode and compare it with the one stored in the 
allowlist. The checksum computation algorithm is the same for both steps 
(building the allowlist and checking at runtime). Let me describe how I compute 
the checksum.


  1.  I get the 
CONSTANT_Class_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.1>
 entry corresponding to `this_class` and rewrite the corresponding 
CONSTANT_Utf8_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.7>
 to a fixed String constant, say "foo".
  2.  Since the name of the class is used to refer to its members 
(fields/methods), I get all 
CONSTANT_Fieldref_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.2>
 entries, and if the `class_index` of an entry corresponds to the old 
`this_class`, I rewrite the UTF8 value behind `class_index` to the same 
constant "foo".
  3.  Next, since the names of the fields in Proxy classes are also suffixed 
by numbers, for example, `private static Method m4`, I rewrite the UTF8 value 
of the name in the 
CONSTANT_NameAndType_info<https://docs.oracle.com/javase/specs/jvms/se11/html/jvms-4.html#jvms-4.4.6>.
  4.  These fields can also have a random order, so I simply sort the entire 
bytecode using `Arrays.sort(byte[])` to eliminate any differences due to the 
ordering of fields/methods.
  5.  Simply sorting the byte array still left minute differences. I could not 
understand why they existed even though the values in the constant pool of the 
bytecode in the allowlist and at runtime were exactly the same after rewriting. 
The differences were in the bytes of the Code attribute of methods. I concluded 
that those bytes store some position information. To avoid this, I created a 
subarray containing only the bytes corresponding to 
`CONSTANT_Utf8_info.bytes`. Computing a checksum over it resulted in the 
same checksum for both classfiles.
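Steps 4 and 5 can be sketched roughly as follows. This is an illustration 
under assumptions, not the exact code of the tool: the class name 
`Utf8Fingerprint`, its method names, and the choice of SHA-256 as the checksum 
are mine, and the rewriting of steps 1 to 3 is left out. The constant pool tags 
and entry sizes follow JVMS section 4.4.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

final class Utf8Fingerprint {
    /** Collects the bytes of every CONSTANT_Utf8_info entry and sorts them. */
    static byte[] sortedUtf8Bytes(byte[] classFile) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(classFile))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readUnsignedShort();              // minor_version
            in.readUnsignedShort();              // major_version
            int count = in.readUnsignedShort();  // constant_pool_count
            ByteArrayOutputStream utf8 = new ByteArrayOutputStream();
            for (int i = 1; i < count; i++) {
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1: {                    // Utf8: u2 length + bytes
                        byte[] bytes = new byte[in.readUnsignedShort()];
                        in.readFully(bytes);
                        utf8.write(bytes, 0, bytes.length);
                        break;
                    }
                    case 3: case 4: case 9: case 10: case 11:
                    case 12: case 17: case 18:
                        in.skipBytes(4);
                        break;
                    case 5: case 6:              // Long/Double use two slots
                        in.skipBytes(8);
                        i++;
                        break;
                    case 7: case 8: case 16: case 19: case 20:
                        in.skipBytes(2);
                        break;
                    case 15:                     // MethodHandle
                        in.skipBytes(3);
                        break;
                    default:
                        throw new IllegalArgumentException("unknown tag " + tag);
                }
            }
            byte[] result = utf8.toByteArray();
            Arrays.sort(result);                 // order-insensitive, as in step 4
            return result;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hex SHA-256 of the sorted Utf8 bytes (hash choice is an assumption). */
    static String checksum(byte[] classFile) {
        try {
            StringBuilder hex = new StringBuilder();
            byte[] digest = MessageDigest.getInstance("SHA-256")
                                         .digest(sortedUtf8Bytes(classFile));
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

One consequence of sorting individual bytes is that the fingerprint is the same 
for any two classfiles whose Utf8 constants contain the same multiset of bytes, 
so it is deliberately coarser than a checksum of the whole file.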

Let's understand the whole approach with an example of a Proxy class.

`

public final class $Proxy42 extends Proxy implements 
org.apache.logging.log4j.core.config.plugins.Plugin {

`

This will go in the allowlist as "Proxy_Plugin: ".

When the same class is intercepted at runtime, say as "$Proxy10", we look for 
"Proxy_Plugin" in the allowlist, and since the checksum algorithm is the same 
in both cases, we get a match and let the class load.
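The lookup can be sketched like this. The key format (simple name of the 
superclass, then the simple names of the interfaces, joined by "_") is inferred 
from the "Proxy_Plugin" example, and `AllowlistKey`, `keyFor`, and `isAllowed` 
are hypothetical names. The sketch works on an already loaded `Class` for 
simplicity, whereas an agent would read the same information from the 
classfile bytes.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.StringJoiner;

final class AllowlistKey {
    /** Builds a name-independent key such as "Proxy_Runnable". */
    static String keyFor(Class<?> generated) {
        StringJoiner key = new StringJoiner("_");
        key.add(generated.getSuperclass().getSimpleName());
        for (Class<?> itf : generated.getInterfaces()) {
            key.add(itf.getSimpleName());
        }
        return key.toString();
    }

    /** True if the allowlist holds the same checksum under the class's key. */
    static boolean isAllowed(Map<String, String> allowlist,
                             Class<?> generated, String checksum) {
        return checksum.equals(allowlist.get(keyFor(generated)));
    }

    public static void main(String[] args) {
        InvocationHandler noop = (proxy, method, methodArgs) -> null;
        Object instance = Proxy.newProxyInstance(
                AllowlistKey.class.getClassLoader(),
                new Class<?>[] {Runnable.class}, noop);
        // The generated name varies; the key stays "Proxy_Runnable".
        System.out.println(instance.getClass().getName()
                + " -> " + keyFor(instance.getClass()));
    }
}
```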

This approach seems to work well for Proxy classes and 
GeneratedConstructorAccessor (which has been removed, as you said). I also 
looked at the species generated by method handles. I did not notice any 
modification in them. Their name generation seemed okay to me. If some new 
Species are generated, they are of course detected since they are not in the 
allowlist.

I have not looked into LambdaMetafactory because I did not encounter it as a 
problem so far, but I am aware its name generation is also unstable. I have run 
my approach on only a few projects. And for hidden classes, I assume the 
agent won't be able to intercept them, so detecting them would be really hard.


Regards,
Aman Sharma

PhD Student
KTH Royal Institute of Technology
School of Electrical Engineering and Computer Science (EECS)
Department of Theoretical Computer Science (TCS)
https://www.kth.se/profile/amansha
https://algomaster99.github.io/

From: liangchenb...@gmail.com 
Sent: Thursday, May 16, 2024 5:52:03 AM
To: Aman Sharma; core-libs-dev
Cc: Martin Monperrus
Subject: Re: Deterministic naming of subclasses of `java/lang/reflect/Proxy`

Hi