Hi David, Chris and Serguei,

thank you all for your quick answers and recommendations!


> Chris: Maybe this is just a terminology issue, but I'm not sure what you mean by "lookup"

> Chris: It sounds like you want to cache or store ObjectIDs instead of ObjectReferences, but later want to map the ObjectID back to an ObjectReference.

You are exactly right. Basically, I would need to obtain an ObjectReference for a given object id.

To give more context: I am storing a (partial) representation of the Java objects of the current program state in RDF <https://en.wikipedia.org/wiki/Resource_Description_Framework> format. This RDF graph is then further processed by external tooling. For example, the objects stored in it might be filtered according to some criteria. After that filtering, I might need to retrieve further information about a Java object through the JDI, given only the object id I have from the RDF graph.

> Chris: Why not just store the ObjectReference?

Yes, keeping a HashMap or something similar that maps object ids to cached ObjectReferences could work.
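
A minimal sketch of such a cache, under a few assumptions: the class name `ObjectIdCache` and its methods are made up for illustration, and the type parameter `R` stands in for `com.sun.jdi.ObjectReference` so that the sketch does not need a running target VM.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.ToLongFunction;

/**
 * Hypothetical helper: maps object ids back to the reference they came from.
 * R stands in for com.sun.jdi.ObjectReference; idOf would then be
 * ObjectReference::uniqueID.
 */
final class ObjectIdCache<R> {
    private final Map<Long, R> byId = new HashMap<>();
    private final ToLongFunction<R> idOf;

    ObjectIdCache(ToLongFunction<R> idOf) {
        this.idOf = idOf;
    }

    /** Remembers a reference and returns the id to write into the RDF graph. */
    long remember(R ref) {
        long id = idOf.applyAsLong(ref);
        byId.putIfAbsent(id, ref); // keep the first reference seen for an id
        return id;
    }

    /** Looks up the reference for an id read back from the RDF graph, if known. */
    Optional<R> lookup(long id) {
        return Optional.ofNullable(byId.get(id));
    }

    /** Number of distinct ids currently cached. */
    int size() {
        return byId.size();
    }
}
```

With JDI this would presumably be created as `new ObjectIdCache<ObjectReference>(ObjectReference::uniqueID)`. The memory cost is one map entry per mirrored object, which is the part that would need measuring on larger programs.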

> Serguei: My guess is that you want it to save on memory overhead. Is it true?

True. I was unsure how well this scales for larger programs since I need to store an ObjectReference for every object in the JVM state. On the other hand, I imagine the JDWP agent internally implements some map structure anyway since it needs to keep track of the object IDs.

> Serguei: If so, is it worth the effort and extra complexity?

I guess I will need to give the caching solution a try and experiment a bit with larger programs.

> Chris: I think the main issue would be validation

True. By construction of the RDF graph, I will probably not run into object ids that are not assigned to an object, especially since I am not continuing execution while processing object ids. In any case, as Serguei pointed out, it should be possible to validate ids via the JDWP or at least deal with error cases.


Regarding the variable locations / code indices:

> Chris: Why do you want the slot?

I want to obtain the source code location of variables and store them in my RDF graph, too. In the end, I want to be able to ask the RDF graph in which variables an object is stored and where in the source code those variables are located.

If multiple variables share the same line location, that would not be an issue.


Ok, so for now, I conclude:

1. I might be able to get away with a caching solution for the object IDs and ObjectReferences, but I have to test this with some larger use cases.

2. Variable locations are stored internally and some might point to the same source code line. To access them I would definitely need to adapt the JDI implementation, though. (Alternatively, I could use an external tool that gives me the AST of the source code with annotated locations. Then I could extract the variable locations from there using the variable name + method name + class name.)


Again, thank you all :)

Best regards,

Anton


On 11/24/21 02:29, Serguei Spitsyn wrote:

Hi Anton,

Thank you for the questions.

I don’t know the history well enough, so I will try to guess a little bit.

Please see my attempts to answer your questions inlined below.

*From: *serviceability-dev <serviceability-dev-r...@openjdk.java.net> on behalf of "Anton W. Haubner" <anton.haub...@outlook.de>
*Date: *Tuesday, November 23, 2021 at 1:25 AM
*To: *<serviceability-dev@openjdk.java.net>
*Cc: *Eduard Kamburjan <edu...@ifi.uio.no>
*Subject: *JDWP features hidden under JDI

Hello!

I am working on a new kind of debugger which extracts information about the
state of Java programs through the JDI to build RDF knowledge graphs.

     The project you are working on looks interesting.

While working on the project, I noticed that there is certain information about the program state that is accessible through JDWP, but which is hidden by the
JDI interfaces (see below for examples).

I am curious whether this was done to simplify the interface, or if there is
a deeper reason behind this, e.g. because the information in question is
unreliable.
If there is no such reason, I might try to modify the JDI reference
implementation to provide this information to me.

*First Example: Retrieving Objects by ID*

The ObjectReference JDI interface allows retrieving the unique id assigned to
an object by the JDWP agent.

However, it seems it is not possible to construct an ObjectReference from such an id. That is, one can not quickly look up an object by its id, but has to
search through all objects to find it again.

Looking at the JDWP specification, it seems that the underlying JDWP protocol
does support looking up objects using just their id:
https://docs.oracle.com/en/java/javase/17/docs/specs/jdwp/jdwp-protocol.html#JDWP_ObjectReference

The reference implementation of the `ObjectReference` interface also seems to
only require this id to retrieve all required information:
https://github.com/openjdk/jdk/blob/dfacda488bfbe2e11e8d607a6d08527710286982/src/jdk.jdi/share/classes/com/sun/tools/jdi/ObjectReferenceImpl.java#L109

/My question now is:/
Is there a specific reason that there is no public factory method to construct
an ObjectReference from an object id?
Or would it be "safe" to create a custom `ObjectReference` implementation that allows this, as long as it deals with the `INVALID_OBJECT` error case of JDWP?

   My understanding is that the JDI just provides a way to store/cache the ObjectReference uniqueID() value.
   I think I understand why you consider a custom `ObjectReference` implementation API to be useful.
   But, as Chris already answered, implementing it requires validating such uniqueID values.
   The JDWP ObjectReference#ReferenceType command looks like a good candidate to provide this verification.

   But there is still the question of whether this custom API is really needed, and why not cache ObjectReferences instead.
   My guess is that you want it to save on memory overhead. Is it true?
   If so, is it worth the effort and extra complexity?
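
To make the validation idea concrete, here is a rough sketch of what the JDWP ObjectReference#ReferenceType round trip looks like at the packet level. It only encodes the command and reads the error code of a reply; how the bytes reach the agent is left out. The class and method names are hypothetical, and the sketch assumes 8-byte object ids (the real size is reported by the VirtualMachine IDSizes command).

```java
import java.nio.ByteBuffer;

/** Sketch: encodes a JDWP ObjectReference.ReferenceType command and reads a reply's error code. */
final class JdwpObjectCheck {
    static final int HEADER_SIZE = 11;          // length(4) + id(4) + flags(1) + 2 more bytes
    static final byte OBJECT_REFERENCE_SET = 9; // ObjectReference command set
    static final byte REFERENCE_TYPE_CMD = 1;   // ReferenceType command
    static final short INVALID_OBJECT = 20;     // JDWP error constant

    /** Builds the command packet. Assumption: object ids are 8 bytes wide. */
    static byte[] referenceTypeCommand(int packetId, long objectId) {
        int length = HEADER_SIZE + Long.BYTES;
        ByteBuffer buf = ByteBuffer.allocate(length); // ByteBuffer is big-endian by default, as is JDWP
        buf.putInt(length);
        buf.putInt(packetId);
        buf.put((byte) 0);                // flags: 0 for a command packet
        buf.put(OBJECT_REFERENCE_SET);
        buf.put(REFERENCE_TYPE_CMD);
        buf.putLong(objectId);
        return buf.array();
    }

    /** Returns the error code of a reply packet (0 means no error). */
    static short replyErrorCode(byte[] reply) {
        if (reply.length < HEADER_SIZE || (reply[8] & 0x80) == 0) {
            throw new IllegalArgumentException("not a JDWP reply packet");
        }
        // In a reply, the two bytes after the flags hold the error code.
        return ByteBuffer.wrap(reply).getShort(9);
    }

    /** True if the agent rejected the id as not referring to a known object. */
    static boolean isInvalidObject(byte[] reply) {
        return replyErrorCode(reply) == INVALID_OBJECT;
    }
}
```

A caller would send `referenceTypeCommand(id, objectId)` and treat `isInvalidObject(reply)` as "this id can no longer be mapped to an object"; any other non-zero error code would still need separate handling.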


*Second Example: Variable Locations*

The JDWP `VariableTable` command reply does contain the code index of variables. Nevertheless, it is neither possible to retrieve the code index of a variable through the `LocalVariable` JDI interface, nor through the `Method` interface.
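
For illustration, the reply body of `VariableTable` could be decoded roughly as follows. This is only a sketch: the field layout follows the JDWP specification (argCnt, slot count, then codeIndex, name, signature, length and slot per entry), while the class names are hypothetical and the input is assumed to be the reply data with the 11-byte packet header already removed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Sketch: decodes the data part of a JDWP Method.VariableTable reply. */
final class VariableTableReply {
    /** One variable entry; field names follow the JDWP specification. */
    static final class Slot {
        final long codeIndex;   // first code index at which the variable is visible
        final String name;
        final String signature; // JNI signature, e.g. "I" for int
        final int length;       // visible while codeIndex <= pc < codeIndex + length
        final int slot;         // local variable index in the frame

        Slot(long codeIndex, String name, String signature, int length, int slot) {
            this.codeIndex = codeIndex;
            this.name = name;
            this.signature = signature;
            this.length = length;
            this.slot = slot;
        }
    }

    /** Parses the reply data (assumed to start right after the packet header). */
    static List<Slot> parse(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data); // big-endian, like JDWP
        buf.getInt();                           // argCnt: words used by arguments, unused here
        int count = buf.getInt();
        List<Slot> slots = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            long codeIndex = buf.getLong();
            String name = readString(buf);
            String signature = readString(buf);
            int length = buf.getInt();
            int slot = buf.getInt();
            slots.add(new Slot(codeIndex, name, signature, length, slot));
        }
        return slots;
    }

    /** JDWP strings are a four-byte length followed by UTF-8 bytes. */
    private static String readString(ByteBuffer buf) {
        byte[] bytes = new byte[buf.getInt()];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

The `codeIndex` of each entry is the value that is not reachable through the public `LocalVariable` interface; mapping it to a source line would still require the method's line table.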

Meanwhile, internally, the `LocalVariable` reference implementation does seem
to store the scope of a variable:
https://github.com/openjdk/jdk/blob/dfacda488bfbe2e11e8d607a6d08527710286982/src/jdk.jdi/share/classes/com/sun/tools/jdi/LocalVariableImpl.java#L56

The Eclipse JDI implementation also stores the plain code index value:
https://git.eclipse.org/c/jdt/eclipse.jdt.debug.git/tree/org.eclipse.jdt.debug/jdi/org/eclipse/jdi/internal/LocalVariableImpl.java#n63

Is there a specific reason why this location information is not exposed in the
public interface?

   My guess is that nobody has asked for this before, or nobody has seen reasonable use cases for it.
   It should not be difficult to add an API to provide this info.
   But again, the question is if your use cases can justify extra complexity.

   Thanks,
   Serguei


Thank you very much for reading my questions.
Can you help me to answer them?

Best regards,

Anton Haubner
