Re: Dynamic classloading in Geode

2015-08-14 Thread Tushar Khairnar
Is this going to be the implementation for deploying jars? If so, GEODE-17
(integrated security) will subject it to authorization checks.

On Fri, Aug 14, 2015 at 1:50 AM, Anthony Baker aba...@pivotal.io wrote:

 Thanks for the suggestions Mike.  At this point we are just exploring
 ideas and putting them out for discussion.  Regarding restricting access to
 this feature, we used the Geode Java client so standard security and
 authorizations would apply.

 Anthony

  On Aug 13, 2015, at 12:24 PM, Michael Stolz mst...@pivotal.io wrote:
 
  If this feature makes it into an actual release please make sure this
  option is not enabled by default and is securely turned off for
  environments where there are strong controls around releasing software
  into production.
  Also make sure that it is secured in terms of Authentication and
  Authorization via the Geode security framework when it is enabled, so
  that not just anyone can push code.
 
  --
  Mike Stolz
  Principal Technical Account Manager
  Mobile: 631-835-4771
 
  On Thu, Aug 13, 2015 at 1:57 PM, Anthony Baker aba...@pivotal.io wrote:
 
  Vito and I spent some time hacking up a prototype for dynamic and
  distributed classloading of Geode functions.  Currently a user has to
  compile a function into a jar and deploy it using gfsh before it can be
  executed.  If we could enable automatic deployment of functions across a
  running cluster it would speed up the development cycle for Geode
  applications and pave the way for other interesting features (like Java8
  lambdas).
 
  Here’s how it works:
 
  A function wrapper (DynamicFunction) serializes the original function
  object and captures dependent classes as byte arrays.  We generate an
  MD5 hash over the bytecode and use that as the key for storing the
  bytecode in a replicated region (“hackday”) within the cache.  When the
  function is invoked, we call putIfAbsent() to distribute the byte code
  prior to executing the function across the cluster.  During execution,
  we extend the TCCL with a new class loader that loads classes from our
  region while the original function is being deserialized.  The original
  function is then executed in parallel on the cluster members.  This
  allows an application developer to iteratively modify and test function
  code without any manual steps to upload class files.
 
  Obviously, there is a lot more thinking and design work to do around
  these ideas.  Here’s our super-hacky code if you’re interested:
  https://gist.github.com/metatype/9b1f39a24e52f5c6f3e1
 
  Caveats:
 
  1) Currently we only capture static class dependencies.  Any class
  dependencies present during method invocations are ignored.  This could
  be addressed by doing byte code inspection (using ASM, Javassist, etc).
 
  2) The region we use to cache class byte code should be automatically
  recreated as a metadata region, similar to how we store pdx types.  We
  also need to configure eviction and expiration attributes to control
  resource usage and remove garbage.
 
  3) We only injected the byte code caching hack into the code path for
  FunctionService.onServers(pool).  Also, the putIfAbsent() call adds
  another network roundtrip.
 
 
  Anthony & Vito




Dynamic classloading in Geode

2015-08-13 Thread Anthony Baker
Vito and I spent some time hacking up a prototype for dynamic and distributed 
classloading of Geode functions.  Currently a user has to compile a function 
into a jar and deploy it using gfsh before it can be executed.  If we could 
enable automatic deployment of functions across a running cluster it would 
speed up the development cycle for Geode applications and pave the way for 
other interesting features (like Java8 lambdas).

Here’s how it works:

A function wrapper (DynamicFunction) serializes the original function object 
and captures dependent classes as byte arrays.  We generate an MD5 hash over 
the bytecode and use that as the key for storing the bytecode in a replicated 
region (“hackday”) within the cache.  When the function is invoked, we call 
putIfAbsent() to distribute the byte code prior to executing the function 
across the cluster.  During execution, we extend the TCCL with a new class 
loader that loads classes from our region while the original function is being 
deserialized.  The original function is then executed in parallel on the 
cluster members.  This allows an application developer to iteratively modify 
and test function code without any manual steps to upload class files.
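
The flow described above could be sketched roughly as follows. This is an
illustration only, not the code from the gist: a ConcurrentHashMap stands in
for the replicated region, and the class name DynamicLoadingSketch, the
helpers md5Hex, publish and withExtendedTccl, and the class-name-to-key map
are all hypothetical (the message does not say how the loader maps a class
name to its hash).

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class DynamicLoadingSketch {

    /** Assumption: stand-in for the replicated region (MD5 hex -> class bytes). */
    static final ConcurrentMap<String, byte[]> REGION = new ConcurrentHashMap<>();

    /** Returns the MD5 digest of the given bytes as lowercase hex. */
    static String md5Hex(byte[] bytes) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(bytes);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }

    /** Stores bytecode under its hash unless that key is already present. */
    static String publish(byte[] bytecode) {
        String key = md5Hex(bytecode);
        REGION.putIfAbsent(key, bytecode);
        return key;
    }

    /** Resolves classes from REGION when the parent loader cannot find them. */
    static final class RegionClassLoader extends ClassLoader {
        private final Map<String, String> keysByClassName; // hypothetical lookup

        RegionClassLoader(ClassLoader parent, Map<String, String> keysByClassName) {
            super(parent);
            this.keysByClassName = keysByClassName;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String key = keysByClassName.get(name);
            byte[] bytecode = (key == null) ? null : REGION.get(key);
            if (bytecode == null) {
                throw new ClassNotFoundException(name);
            }
            return defineClass(name, bytecode, 0, bytecode.length);
        }
    }

    /** Runs a task with the TCCL temporarily extended, then restores it. */
    static void withExtendedTccl(Map<String, String> keysByClassName, Runnable task) {
        Thread current = Thread.currentThread();
        ClassLoader original = current.getContextClassLoader();
        current.setContextClassLoader(new RegionClassLoader(original, keysByClassName));
        try {
            task.run();
        } finally {
            current.setContextClassLoader(original);
        }
    }
}
```

Keying by content hash means republishing unchanged bytecode is a no-op,
while a modified class gets a new key. In this sketch the deserialization of
the original function would happen inside the task passed to
withExtendedTccl.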

Obviously, there is a lot more thinking and design work to do around these 
ideas.  Here’s our super-hacky code if you’re interested:
https://gist.github.com/metatype/9b1f39a24e52f5c6f3e1 

Caveats:

1) Currently we only capture static class dependencies.  Any class dependencies 
present during method invocations are ignored.  This could be addressed by 
doing byte code inspection (using ASM, Javassist, etc).

2) The region we use to cache class byte code should be automatically recreated 
as a metadata region, similar to how we store pdx types.  We also need to 
configure eviction and expiration attributes to control resource usage and 
remove garbage.

3) We only injected the byte code caching hack into the code path for 
FunctionService.onServers(pool).  Also, the putIfAbsent() call adds another 
network roundtrip.


Anthony & Vito

Re: Dynamic classloading in Geode

2015-08-13 Thread Anthony Baker
Thanks for the suggestions Mike.  At this point we are just exploring ideas and 
putting them out for discussion.  Regarding restricting access to this feature, 
we used the Geode Java client so standard security and authorizations would 
apply.

Anthony

 On Aug 13, 2015, at 12:24 PM, Michael Stolz mst...@pivotal.io wrote:
 
 If this feature makes it into an actual release please make sure this
 option is not enabled by default and is securely turned off for
 environments where there are strong controls around releasing software into
 production.
 Also make sure that it is secured in terms of Authentication and
 Authorization via the Geode security framework when it is enabled, so that
 not just anyone can push code.
 
 --
 Mike Stolz
 Principal Technical Account Manager
 Mobile: 631-835-4771
 
 On Thu, Aug 13, 2015 at 1:57 PM, Anthony Baker aba...@pivotal.io wrote:
 
 Vito and I spent some time hacking up a prototype for dynamic and
 distributed classloading of Geode functions.  Currently a user has to
 compile a function into a jar and deploy it using gfsh before it can be
 executed.  If we could enable automatic deployment of functions across a
 running cluster it would speed up the development cycle for Geode
 applications and pave the way for other interesting features (like Java8
 lambdas).
 
 Here’s how it works:
 
 A function wrapper (DynamicFunction) serializes the original function
 object and captures dependent classes as byte arrays.  We generate an MD5
 hash over the bytecode and use that as the key for storing the bytecode in
 a replicated region (“hackday”) within the cache.  When the function is
 invoked, we call putIfAbsent() to distribute the byte code prior to
 executing the function across the cluster.  During execution, we extend the
 TCCL with a new class loader that loads classes from our region while the
 original function is being deserialized.  The original function is then
 executed in parallel on the cluster members.  This allows an application
 developer to iteratively modify and test function code without any manual
 steps to upload class files.
 
 Obviously, there is a lot more thinking and design work to do around these
 ideas.  Here’s our super-hacky code if you’re interested:
 https://gist.github.com/metatype/9b1f39a24e52f5c6f3e1 
 
 Caveats:
 
 1) Currently we only capture static class dependencies.  Any class
 dependencies present during method invocations are ignored.  This could be
 addressed by doing byte code inspection (using ASM, Javassist, etc).
 
 2) The region we use to cache class byte code should be automatically
 recreated as a metadata region, similar to how we store pdx types.  We also
 need to configure eviction and expiration attributes to control resource
 usage and remove garbage.
 
 3) We only injected the byte code caching hack into the code path for
 FunctionService.onServers(pool).  Also, the putIfAbsent() call adds another
 network roundtrip.
 
 
 Anthony & Vito



Re: Dynamic classloading in Geode

2015-08-13 Thread Michael Stolz
If this feature makes it into an actual release please make sure this
option is not enabled by default and is securely turned off for
environments where there are strong controls around releasing software into
production.
Also make sure that it is secured in terms of Authentication and
Authorization via the Geode security framework when it is enabled, so that
not just anyone can push code.

--
Mike Stolz
Principal Technical Account Manager
Mobile: 631-835-4771

On Thu, Aug 13, 2015 at 1:57 PM, Anthony Baker aba...@pivotal.io wrote:

 Vito and I spent some time hacking up a prototype for dynamic and
 distributed classloading of Geode functions.  Currently a user has to
 compile a function into a jar and deploy it using gfsh before it can be
 executed.  If we could enable automatic deployment of functions across a
 running cluster it would speed up the development cycle for Geode
 applications and pave the way for other interesting features (like Java8
 lambdas).

 Here’s how it works:

 A function wrapper (DynamicFunction) serializes the original function
 object and captures dependent classes as byte arrays.  We generate an MD5
 hash over the bytecode and use that as the key for storing the bytecode in
 a replicated region (“hackday”) within the cache.  When the function is
 invoked, we call putIfAbsent() to distribute the byte code prior to
 executing the function across the cluster.  During execution, we extend the
 TCCL with a new class loader that loads classes from our region while the
 original function is being deserialized.  The original function is then
 executed in parallel on the cluster members.  This allows an application
 developer to iteratively modify and test function code without any manual
 steps to upload class files.

 Obviously, there is a lot more thinking and design work to do around these
 ideas.  Here’s our super-hacky code if you’re interested:
 https://gist.github.com/metatype/9b1f39a24e52f5c6f3e1 

 Caveats:

 1) Currently we only capture static class dependencies.  Any class
 dependencies present during method invocations are ignored.  This could be
 addressed by doing byte code inspection (using ASM, Javassist, etc).

 2) The region we use to cache class byte code should be automatically
 recreated as a metadata region, similar to how we store pdx types.  We also
 need to configure eviction and expiration attributes to control resource
 usage and remove garbage.

 3) We only injected the byte code caching hack into the code path for
 FunctionService.onServers(pool).  Also, the putIfAbsent() call adds another
 network roundtrip.


 Anthony & Vito