Vito and I spent some time hacking up a prototype for dynamic and distributed 
classloading of Geode functions.  Currently a user has to compile a function 
into a jar and deploy it using gfsh before it can be executed.  If we could 
enable automatic deployment of functions across a running cluster it would 
speed up the development cycle for Geode applications and pave the way for 
other interesting features (like Java 8 lambdas).

Here’s how it works:

A function wrapper (DynamicFunction) serializes the original function object 
and captures dependent classes as byte arrays.  We generate an MD5 hash over 
the bytecode and use that as the key for storing the bytecode in a replicated 
region (“hackday”) within the cache.  When the function is invoked, we call 
putIfAbsent() to distribute the bytecode prior to executing the function 
across the cluster.  During execution, we extend the thread context class 
loader (TCCL) with a new class loader that loads classes from our region while 
the original function is being 
deserialized.  The original function is then executed in parallel on the 
cluster members.  This allows an application developer to iteratively modify 
and test function code without any manual steps to upload class files.
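
A minimal sketch of the flow above, under some loud assumptions: a plain ConcurrentHashMap stands in for the replicated region, and the names DynamicClassSketch, RegionClassLoader, classNameToKey and withLoader are hypothetical, not taken from the prototype. In particular, the class-name-to-hash map is only one possible way to let the loader find the right entry; the prototype may do this differently.

```java
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

public class DynamicClassSketch {

    /** Hex-encoded MD5 digest of the bytecode, used as the storage key. */
    static String md5Hex(byte[] bytecode) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(bytecode);
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            // MD5 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** Stores the bytecode under its hash unless already present; returns the key. */
    static String publish(ConcurrentMap<String, byte[]> region, byte[] bytecode) {
        String key = md5Hex(bytecode);
        region.putIfAbsent(key, bytecode); // no-op if another caller got there first
        return key;
    }

    /** Loader that falls back to the bytecode map when the parent cannot find a class. */
    static class RegionClassLoader extends ClassLoader {
        private final Map<String, byte[]> region;
        private final Map<String, String> classNameToKey; // hypothetical name -> hash index

        RegionClassLoader(ClassLoader parent, Map<String, byte[]> region,
                          Map<String, String> classNameToKey) {
            super(parent);
            this.region = region;
            this.classNameToKey = classNameToKey;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String key = classNameToKey.get(name);
            byte[] bytes = (key == null) ? null : region.get(key);
            if (bytes == null) {
                throw new ClassNotFoundException(name);
            }
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Runs the task with the given loader installed as the TCCL, then restores the old one. */
    static <T> T withLoader(ClassLoader loader, Supplier<T> task) {
        Thread thread = Thread.currentThread();
        ClassLoader previous = thread.getContextClassLoader();
        thread.setContextClassLoader(loader);
        try {
            return task.get(); // e.g. deserialize and run the original function here
        } finally {
            thread.setContextClassLoader(previous);
        }
    }
}
```

Passing the current TCCL as the parent of RegionClassLoader keeps normal delegation intact, so only classes the parent cannot resolve are looked up in the map.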

Obviously, there is a lot more thinking and design work to do around these 
ideas.  Here’s our super-hacky code if you’re interested: 


1) Currently we only capture static class dependencies.  Any class dependencies 
present during method invocations are ignored.  This could be addressed by 
doing bytecode inspection (using ASM, Javassist, etc.).

2) The region we use to cache class bytecode should be automatically recreated 
as a metadata region, similar to how we store PDX types.  We also need to 
configure eviction and expiration attributes to control resource usage and 
remove garbage.

3) We only injected the bytecode caching hack into the code path for 
FunctionService.onServers(pool).  Also, the putIfAbsent() call adds another 
network round trip.
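
On point 1), any bytecode inspection would start from the raw class file bytes. The sketch below shows one common way to obtain them using only the JDK; it is an assumption about how capture could work, not a description of the prototype, and the names ClassBytesSketch and bytecodeOf are hypothetical. It does not handle array or primitive types.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ClassBytesSketch {

    /** Reads the compiled bytecode of a class through its .class resource. */
    static byte[] bytecodeOf(Class<?> type) {
        String name = type.getName();
        // Resource name relative to the class's own package, e.g. "Map$Entry.class".
        String resource = name.substring(name.lastIndexOf('.') + 1) + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException("No class file found for " + name);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The returned array could then be handed to a library such as ASM to walk the classes referenced from method bodies.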

Anthony & Vito
