On 19/04/2016 18:00, Andrew Dinn wrote:
:
I have been building and testing Byteman on JDK9 using the EA program
releases and have not seen anything break so far. However, that may just
be because i) Byteman and the Byteman tests are only using runtime
classes from the java base module and ii) app code used in tests is not
modularized -- well, at least not using Jigsaw (see below).
It's good to hear that you are testing with the EA builds.
I think the main thing for Byteman, and this will be true of at least some
other agents too, is that it will likely need to be updated to
support instrumentation with modules. We have a reasonable compatibility
story for existing JVM TI and java agents but as soon as you get into
instrumenting code to statically or reflectively access code in other
modules then it may require the agent to arrange for this access to be
allowed.
To get started, I would suggest studying the documents,
presentations and recordings that we have linked from the Project Jigsaw
page. This should get you up to speed on the new concepts and how
accessibility works with modules. I think this has to be a prerequisite to
having a more detailed discussion on how agents doing BCI can work with
modules.
My concern was based on statements in the requirements about limiting
reflective access according to module context i.e. code in module M
might lose the ability to access non-public members of classes belonging
to module M'.
Core reflection has always been specified to do the same access checks
as the Java Language and bytecode. The only thing you lose is the
sledgehammer that is setAccessible, as it cannot be used to break into
non-exported packages. You can use it to get at non-public members in
your own module, you can use it to get to non-public members of types in
packages that others have exported to you, you just cannot use it to break
into the parts of the module that the module author has decided not to
export to you.
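To make the first of those cases concrete, here is a minimal sketch of
using setAccessible on a non-public member in your own module. The class
and field names are made up for illustration, and the sketch assumes
everything runs from the class path, so both classes are in the same
(unnamed) module.

```java
import java.lang.reflect.Field;

// Hypothetical class standing in for "a type in your own module".
class Account {
    private int balance = 42;
}

final class OwnModuleReflection {
    // Reads the private field via core reflection. The caller and the
    // declaring class are in the same module, so setAccessible(true)
    // is permitted.
    static int readBalance(Account a) {
        try {
            Field f = Account.class.getDeclaredField("balance");
            f.setAccessible(true);
            return f.getInt(a);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readBalance(new Account()));
    }
}
```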
Byteman relies on reflective access to get/put non-public
fields and to invoke non-public methods. I noticed in the Jigsaw code
base that setAccessible was checking the module of the caller. So, I wanted
to be sure that setAccessible would still work (at the very least would
still work so long as my Java agent code was making the call).
This is where the agent author needs to be creative. It may, for
example, inject bytecode into the victim module to invoke addExports and
give the burglar access.
:
Firstly, if the code injected into C.m includes reference to a JDK
runtime class C' in module M' and C' is not exported then can a
classloader lookup from the classloader of C fail? If so then this is
going to be a legacy compatibility problem.
There are no changes to visibility and so C' is visible, Class.forName
should work as before for example.
That said, if C' is in a package that is not exported to C then code in
C cannot access it. This is not specific to JDK modules, M' is any module.
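The difference between visibility and accessibility can be sketched as
follows. This uses the Class and Module methods as they appear in JDK 9
as released (the EA builds may differ), and jdk.internal.misc.Unsafe is
only an assumed example of a class in a package that is not exported to
everyone. The lookup succeeds; whether the package is exported to the
caller is a separate question.

```java
final class VisibilityVsAccess {
    // Assumption: used purely as an example of a JDK class that lives in
    // a package its module does not export to all other modules.
    static final String NAME = "jdk.internal.misc.Unsafe";

    // Visibility: the class loader lookup does not depend on exports.
    static Class<?> lookup() {
        try {
            return Class.forName(NAME, false, ClassLoader.getSystemClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Accessibility: is the package of c exported to the caller's module?
    static boolean exportedToCaller(Class<?> c) {
        Module caller = VisibilityVsAccess.class.getModule();
        return c.getModule().isExported(c.getPackageName(), caller);
    }

    public static void main(String[] args) {
        Class<?> c = lookup();
        System.out.println(c.getName() + " is visible");
        System.out.println("exported to caller: " + exportedToCaller(c));
        System.out.println("java.lang exported to caller: " + exportedToCaller(String.class));
    }
}
```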
Secondly, assume the reference to C' can be resolved allowing Byteman
agent code to obtain a Field instance fi' for a non-public field f' or
a Method instance mi' for a non-public method m' of C'. Will a call to
setAccessible on fi'/mi' be rejected because the members in question belong
to a module M' which Byteman agent classes do not belong to? It was this
possibility that was hinted at in the requirements and which set alarm
bells ringing.
setAccessible should fail. If Byteman is injecting code into C that uses
core reflection to access a member of C' then it will need to arrange
for M' to export the package containing C' to C.
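A rough way to probe this from Java code is sketched below. It relies on
AccessibleObject.trySetAccessible and Module.isOpen as they exist in
JDK 9 as released, where access to non-public members of another module
depends on the package being open to the caller (a notion that is finer
grained than the plain export discussed here). The target class is an
assumed example only.

```java
import java.lang.reflect.Constructor;

final class DeepReflectionCheck {
    // Assumption: an example of a JDK class with a private constructor in
    // a package that its module does not make available to everyone.
    static final String TARGET = "jdk.internal.misc.Unsafe";

    // Finding the member does not require access to it.
    static Constructor<?> privateCtor(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClassLoader.getSystemClassLoader());
            return c.getDeclaredConstructor();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Has the owning module opened the package of c to the caller's module?
    static boolean openToCaller(Class<?> c) {
        Module caller = DeepReflectionCheck.class.getModule();
        return c.getModule().isOpen(c.getPackageName(), caller);
    }

    // Like setAccessible(true), but reports failure instead of throwing.
    static boolean tryDeepAccess(Constructor<?> k) {
        return k.trySetAccessible();
    }

    public static void main(String[] args) {
        Constructor<?> k = privateCtor(TARGET);
        System.out.println("open to caller: " + openToCaller(k.getDeclaringClass()));
        System.out.println("trySetAccessible: " + tryDeepAccess(k));
    }
}
```

The two printed values should agree: suppressing the access check on a
private member of another named module is only allowed when the package
has been opened to the caller.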
Finally, will accesses/invocations of fi'/mi' from other code possibly be
rejected because the fields in question belong to a module M' which the
accessing code does not belong to?
The last question is relevant when Byteman executes injected code by
compiling it to bytecode rather than interpreting it. The generated
bytecode is attached to a dynamically generated class which means that
it cannot use get/put or invoke bytecodes to access private members. To
resolve this the bytecode is given access to the necessary instance fi'
or mi' when it needs to access/invoke non-public members. So, the
question is what happens when this dynamically generated class does not
belong to module M'? Will the Jigsaw JVM reject the reflective access
from this bytecode?
If setAccessible(true) has already been successfully called on the
Method or Field then handing it around is somewhat dangerous as it can
be used to invoke the method or access the field without an access
check. However if the generated code is being passed a Method or Field
where setAccessible(true) has not been called then there will be an access
check.
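The two cases can be sketched like this. The class names are
hypothetical; Recipient stands in for the dynamically generated class
that is handed a Field by other code.

```java
import java.lang.reflect.Field;

// Hypothetical class owning a private field.
class Vault {
    private String secret = "s3cret";

    // Hands out a fresh Field object, optionally with the access check
    // suppressed. The accessible flag belongs to this Field object only.
    static Field secretField(boolean accessible) {
        try {
            Field f = Vault.class.getDeclaredField("secret");
            if (accessible) {
                f.setAccessible(true);
            }
            return f;
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Hypothetical stand-in for generated code that receives the Field.
final class Recipient {
    // Returns the field value, or null if the access check rejected the read.
    static Object read(Field f, Object target) {
        try {
            return f.get(target);
        } catch (IllegalAccessException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Vault v = new Vault();
        System.out.println(read(Vault.secretField(true), v));
        System.out.println(read(Vault.secretField(false), v));
    }
}
```

With the flag set, whoever holds the Field can read the value; without
it, the read is checked against the class doing the reading.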
:
There is already an extension to Byteman which supports the notion of
module imports for JBoss Modules. In that case injected code may import
JBoss modules M1, M2 etc. Types mentioned in the injected code are
resolved using a composite classloader built by delegating to the
classloader for C and then the classloaders C1, C2 etc for M1, M2 etc.
Types are resolved firstly by looking them up in the target method's
classloader C then failing that by looking up in the classloaders C1, C2.
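The lookup order described here can be sketched as a small composite
class loader. This is an illustration only, not Byteman's or JBoss
Modules' actual code; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical composite loader: tries each delegate in order, e.g. the
// target method's loader first, then the loaders of imported modules.
final class CompositeClassLoader extends ClassLoader {
    private final List<ClassLoader> delegates;

    CompositeClassLoader(List<ClassLoader> delegates) {
        super(null); // no parent; only bootstrap classes are tried before the delegates
        this.delegates = new ArrayList<>(delegates);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        for (ClassLoader cl : delegates) {
            try {
                return cl.loadClass(name);
            } catch (ClassNotFoundException ignored) {
                // fall through to the next delegate
            }
        }
        throw new ClassNotFoundException(name);
    }

    // Convenience for callers that prefer null over an exception.
    Class<?> tryLoad(String name) {
        try {
            return loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }
}
```

Note that a loader like this only affects which classes are visible; it
does not change whether code is allowed to access them.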
JBoss Modules provides a dynamic API to access and delegate to these
classloaders. Is there going to be some sort of equivalent for modules?
If so what restrictions will be placed on use of that API? I'd like to
be able to deal with any legacy reference failures by providing a Jigsaw
module imports extension but I don't know whether that is going to be
possible without details of what API can be used to allow composite
classloaders to delegate class lookups into modules.
For the most part, this is a non-issue with the changes in JDK 9 because
we have not changed visibility. However it is possible to instantiate
groups of modules where each module is defined to its own class loader.
Really advanced users can also define modules to their class loaders as
they see fit.
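As a sketch of the first of these, the following uses the layer API as
it appears in JDK 9 as released (java.lang.ModuleLayer and
java.lang.module.Configuration); the EA builds may name things
differently. The helper name and its parameters are hypothetical.

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleFinder;
import java.nio.file.Path;
import java.util.Set;

final class ManyLoadersLayer {
    // Resolves the given root modules from modulePath on top of the boot
    // layer and defines each resolved module to its own class loader.
    static ModuleLayer define(Set<String> roots, Path... modulePath) {
        ModuleLayer parent = ModuleLayer.boot();
        Configuration cf = parent.configuration()
                .resolve(ModuleFinder.of(modulePath), ModuleFinder.of(), roots);
        return parent.defineModulesWithManyLoaders(cf, ClassLoader.getSystemClassLoader());
    }

    public static void main(String[] args) {
        // With no roots and no module path the resulting layer is empty.
        ModuleLayer layer = define(Set.of());
        for (Module m : layer.modules()) {
            System.out.println(m.getName() + " -> " + m.getClassLoader());
        }
    }
}
```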
If I read your paragraphs correctly then the instrumentation involves
injecting code that can only work if the class loader delegation is
augmented, is that right? There aren't hooks or other means in the API
to do this. To be honest, it sounds like a hazard that would need strong
use-cases.
Is this a Jigsaw extension to the java.lang.instrument API you are
talking about? If so can you point me at the code and/or javadoc?
The EA downloads have a link to the docs:
http://download.java.net/java/jdk9/docs/api/index.html
There is a new section on "Instrumenting code in modules" in
java.lang.instrument. The Instrumentation class defines a new method
addModuleReads to update a module to read another. This is not
interesting when you are injecting code that uses core reflection but
will be important when you inject bytecode with static references to
types in other modules. You'll need it if you inject code that uses method
handles too.
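A minimal sketch of an agent adding a read edge follows. Note that it
uses Instrumentation.redefineModule and isModifiableModule, which are
the methods available in JDK 9 as released, rather than the
addModuleReads method of the EA builds mentioned above. The module names
are hypothetical placeholders.

```java
import java.lang.instrument.Instrumentation;
import java.util.Map;
import java.util.Set;

final class ReadEdgeAgent {
    // Hypothetical module names; a real agent would take these from its
    // own configuration.
    static final String FROM = "app.module";
    static final String TO = "helper.module";

    public static void premain(String args, Instrumentation inst) {
        ModuleLayer boot = ModuleLayer.boot();
        boot.findModule(FROM).ifPresent(from ->
                boot.findModule(TO).ifPresent(to -> addRead(inst, from, to)));
    }

    // Updates 'from' to read 'to', leaving exports, opens, uses and
    // provides untouched.
    static void addRead(Instrumentation inst, Module from, Module to) {
        if (!inst.isModifiableModule(from)) {
            return;
        }
        inst.redefineModule(from, Set.of(to), Map.of(), Map.of(), Set.of(), Map.of());
    }
}
```

The same call takes maps of extra exports and opens, which is one way an
agent could arrange for a package to be made available to the injected
code.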
The updated JVM TI and JNI docs are here:
http://download.java.net/java/jdk9/docs/platform/jvmti/jvmti.html
http://download.java.net/java/jdk9/docs/technotes/guides/jni/spec/jniTOC.html
The JVM TI spec has a section "Bytecode Instrumentation of code in
modules". JNI has a new set of functions that are documented under
"Module Operations".
When you say "it can instrument code in A to have A reflectively read
B." it sounds like you are saying that my existing use of Members is
going to continue to work. Is that what you mean? Or am I being too
optimistic? :-)
If the injected code is using core reflection then you won't need to use
the API to reflectively add read edges. This does not mean you won't need
to inject code to reflectively export packages though.
:
I don't follow what you are actually suggesting here. In what sense
would this 'export' the package? If you mean that I need to transform B
in order to use it from A then what type of transformation would I need
to apply to a class B in order to allow A to access a private method m?
Would this involve adding a new public method, one that would
call the old one?
That's not an option since Byteman cannot make structural changes to
bytecode. It has to be able to transform classes which already exist
when the agent is loaded (it is a retransformer) and also needs to be
able to remove injected code and restore the status quo (e.g. for
testing different changes need to be present from one test to the next).
So, any requirement to change structure is not a solution.
I didn't suggest schema changes although load time instrumentation does
give you the opportunity to add initializers which could be useful to
some of the issues you might encounter.
:
I thought I asked for a get out for /JVMTI Java agents/ i.e. for /Java/
code loaded by the -javaagent command line option or the VM_Attach API
rather than for JVMTI Native agents. The former is certainly what I was
interested in.
Anyway, I think I have outlined what the problems are above. What I was
asking for was some way of bypassing these problems when calls were made
from agent code i.e. allowing reflective accesses to non-public Members
to proceed if it was known that the access was somehow sanctioned by
Java agent code.
So, if it turns out that usage of a Member from any invoking context is
constrained merely according to whether setAccessible(true) has been
successfully executed then could you ensure that the check as to whether
setAccessible should succeed or fail can identify that the caller
belongs to agent code and if so make it succeed.
Alternatively, if an access from the bytecode for method C.m of a Member
of class C' (either get, put or invoke) is constrained according to some
relation between the modules of C and C' in question then can you ensure
that when C belongs to agent code the access always succeeds?
The latter would not be enough to deal with potential restrictions on
the current generated bytecode but I can probably ensure that the
generated code calls into Byteman code to do the reflective access.
There is a lot that we could comment on here but I think it would be better
to spend a bit of time coming up to speed on modules and encapsulation.
I think then we can continue at least part of this thread as you work
through how to update Byteman to arrange for the intended access to work.
-Alan