Re: Dynamic classloading in Geode
Is this going to be the implementation for deploy jars? If so, GEODE-17 (integrated security) will subject it to authorization checks.

On Fri, Aug 14, 2015 at 1:50 AM, Anthony Baker aba...@pivotal.io wrote:
> Thanks for the suggestions Mike. At this point we are just exploring ideas and putting them out for discussion. Regarding restricting access to this feature, we used the Geode Java client so standard security and authorizations would apply.
>
> Anthony
>
> On Aug 13, 2015, at 12:24 PM, Michael Stolz mst...@pivotal.io wrote:
>> If this feature makes it into an actual release please make sure this option is not enabled by default and is securely turned off for environments where there are strong controls around releasing software into production. Also make sure that it is secured in terms of Authentication and Authorization via the Geode security framework when it is enabled, so that not just anyone can push code.
>>
>> On Thu, Aug 13, 2015 at 1:57 PM, Anthony Baker aba...@pivotal.io wrote:
>>> Vito and I spent some time hacking up a prototype for dynamic and distributed classloading of Geode functions. [...]
Dynamic classloading in Geode
Vito and I spent some time hacking up a prototype for dynamic and distributed classloading of Geode functions. Currently a user has to compile a function into a jar and deploy it using gfsh before it can be executed. If we could enable automatic deployment of functions across a running cluster, it would speed up the development cycle for Geode applications and pave the way for other interesting features (like Java 8 lambdas).

Here’s how it works:

A function wrapper (DynamicFunction) serializes the original function object and captures dependent classes as byte arrays. We generate an MD5 hash over the bytecode and use that as the key for storing the bytecode in a replicated region (“hackday”) within the cache. When the function is invoked, we call putIfAbsent() to distribute the bytecode across the cluster before the function executes.

During execution, we extend the TCCL (thread context class loader) with a new class loader that loads classes from our region while the original function is being deserialized. The original function is then executed in parallel on the cluster members.

This allows an application developer to iteratively modify and test function code without any manual steps to upload class files. Obviously, there is a lot more thinking and design work to do around these ideas. Here’s our super-hacky code if you’re interested:

https://gist.github.com/metatype/9b1f39a24e52f5c6f3e1

Caveats:

1) Currently we only capture static class dependencies. Any classes referenced only during method invocations are ignored. This could be addressed by doing bytecode inspection (using ASM, Javassist, etc.).

2) The region we use to cache class bytecode should be automatically recreated as a metadata region, similar to how we store PDX types. We also need to configure eviction and expiration attributes to control resource usage and remove garbage.

3) We only injected the bytecode caching hack into the code path for FunctionService.onServers(pool). Also, the putIfAbsent() call adds another network round trip.

Anthony and Vito
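For readers who want to see the moving parts without opening the gist, here is a minimal, self-contained sketch of the flow described above. It is not the prototype's code: a plain ConcurrentMap stands in for the replicated “hackday” region, and every name in it (BytecodeCacheSketch, Greeter, RegionClassLoader, the class-name-to-key manifest) is a hypothetical made up for illustration. It also assumes the compiled .class file is readable as a resource from the class's own loader.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

public class BytecodeCacheSketch {

    /** Hypothetical stand-in for a user function whose bytecode gets shipped. */
    public static class Greeter implements Supplier<String> {
        public Greeter() {}
        @Override public String get() { return "hello from dynamically loaded code"; }
    }

    /** Reads the compiled .class bytes of cls through its own class loader. */
    static byte[] captureBytes(Class<?> cls) {
        String resource = cls.getName().replace('.', '/') + ".class";
        try (InputStream in = cls.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) throw new IllegalStateException("no bytecode found for " + cls);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) out.write(buf, 0, n);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** MD5 over the bytecode, rendered as 32 lowercase hex digits. */
    static String md5Hex(byte[] bytes) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(bytes);
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Stores bytecode under its hash; identical bytecode is stored only once. */
    static String publish(ConcurrentMap<String, byte[]> region, byte[] bytecode) {
        String key = md5Hex(bytecode);
        region.putIfAbsent(key, bytecode);
        return key;
    }

    /** Defines classes from the region; the manifest maps class name to hash key. */
    static class RegionClassLoader extends ClassLoader {
        private final Map<String, byte[]> region;
        private final Map<String, String> keysByClassName;

        RegionClassLoader(ClassLoader parent, Map<String, byte[]> region,
                          Map<String, String> keysByClassName) {
            super(parent);
            this.region = region;
            this.keysByClassName = keysByClassName;
        }

        @Override protected Class<?> findClass(String name) throws ClassNotFoundException {
            String key = keysByClassName.get(name);
            byte[] bytes = (key == null) ? null : region.get(key);
            if (bytes == null) throw new ClassNotFoundException(name);
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Loads className through loader and calls its public no-arg constructor. */
    static Object instantiate(ClassLoader loader, String className) {
        try {
            return loader.loadClass(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ConcurrentMap<String, byte[]> region = new ConcurrentHashMap<>();
        String name = Greeter.class.getName();
        String key = publish(region, captureBytes(Greeter.class));
        // A null parent (bootstrap only) mimics a server that lacks the class locally.
        ClassLoader loader =
            new RegionClassLoader(null, region, Collections.singletonMap(name, key));
        Object fn = instantiate(loader, name);
        System.out.println(((Supplier<?>) fn).get());
    }
}
```

On a real server the parent would be the TCCL rather than null, as described above, so the region is consulted only for classes the member cannot already resolve. Keying by content hash means a recompiled function gets a new key instead of overwriting the old entry, which is one reason caveat 2 about eviction and expiration matters.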
Re: Dynamic classloading in Geode
Thanks for the suggestions, Mike. At this point we are just exploring ideas and putting them out for discussion. Regarding restricting access to this feature: we used the Geode Java client, so standard security and authorization would apply.

Anthony
Re: Dynamic classloading in Geode
If this feature makes it into an actual release please make sure this option is not enabled by default and is securely turned off for environments where there are strong controls around releasing software into production. Also make sure that it is secured in terms of Authentication and Authorization via the Geode security framework when it is enabled, so that not just anyone can push code.

--
Mike Stolz
Principal Technical Account Manager