> On May 25, 2019, at 6:27 PM, David Blevins <david.blev...@gmail.com> wrote:
> 
> # Revisiting TomEE Embedded
> 
> Already in the data, we can see 50% of our "new startup" time is extracting 
> the server from the tar.gz.  We could speed things up by 50% for our build 
> and everyone's build, just by skipping that step.
> 
> We have always had a TomEE Embedded distribution, but we've approached it 
> with a completely different mindset than regular TomEE and therefore it has 
> less functionality and simply could not compete with a TomEE zip.  
> 
> I believe we took a wrong turn at the start.
> 
> In a plain Tomcat zip the server is started with an incredibly small 
> classpath, basically just what is in the bin/ dir.  Tomcat will then load 
> everything in the lib/ dir into a new classloader, grab a class from that new 
> classloader and tell it to finish the job.
> 
> It would be possible for us to write an "embedded" version that does exactly 
> this, but using jars from the local Maven repo, which would result in us being 
> able to start and stop many Tomcat/TomEE instances in one JVM.  We would have a 
> slightly different version of the Tomcat bootstrap jar, but everything after 
> that point would be 100% the same.  The embedded version would start 2x 
> faster than a "remote" version but have the same functionality and cost us 
> very little in terms of maintenance.

I've created a prototype of the above idea as a new embedded version of TomEE.

The concepts at the heart of it are:

 - create a pom.xml for each server.zip that has a 1-to-1 mapping of every 
library in that zip
 - create a bootstrapper based on `org.apache.catalina.startup.Bootstrap` that 
can bootstrap a server using the classpath created by that pom
 - enjoy a server environment identical to the standalone zip, but running 
inside your JVM in an embedded or "serverless" fashion

See the following two docs for an overview of what it looks like:

 - https://github.com/apache/tomee/tree/master/examples/serverless-tomee-webprofile#serverless-tomee-webprofile
 - https://github.com/apache/tomee/tree/master/examples/serverless-builder#serverless-builder-api

Essentially there is a functional API that allows you to build an embedded 
server instance by passing in lambdas and method references that assist in the 
building.  The approach still involves a very small number of config files in a 
temporary directory that represents the catalina.base.

Where many embedded/serverless APIs go wrong is taking the perspective that any 
kind of configuration file is bad.  When this perspective is taken, users must 
learn an elaborate API, usually a DSL, to build every aspect of the server 
instance by hand.

The problem with this approach is that it takes an enormous effort to create 
these DSLs and get them to cover all possibilities in how a server can be 
configured.  It sounds neat until you realize that you'd need a DSL to cover 
all the configuration files below:

 - catalina.policy
 - catalina.properties
 - context.xml
 - jaspic-providers.xml
 - logging.properties
 - server.xml
 - system.properties
 - tomcat-users.xml
 - tomee.xml
 - web.xml

The effort to cover all that is high, and the effort to document it properly 
is just as high.  And while it's all very neat, the amount of effort a user 
must spend to learn it all is also quite high.

And we go through all this effort as creators and users, just to get a version 
of the server we love that runs in our same VM as opposed to being a separate 
process.

If the goal is simply to get a Tomcat/TomEE running in your JVM so you can 
write a test case or serverless app, there is a shorter path.

When you boot Tomcat from the command line there is actually a very 
intentionally thin class called `Bootstrap` that does very little aside from 
constructing a ClassLoader out of the jars in lib/*.jar and bin/*.jar, then 
constructing an instance of `Catalina` and starting it.  The real heavy lifting 
is done by Catalina from inside the newly constructed ClassLoader.
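
As a rough sketch of the pattern just described (this is not the actual Tomcat 
code; the class `JarDirLoader` and its method names are made up for 
illustration, and the real `Bootstrap` handles more cases), building a 
ClassLoader from the jars in a directory and reflectively creating a class 
from it looks something like this:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Tomcat or TomEE.
public class JarDirLoader {

    /** Collects a URL for every *.jar found directly inside the given directory. */
    public static List<URL> jarUrls(Path dir) {
        List<URL> urls = new ArrayList<>();
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(dir, "*.jar")) {
            for (Path jar : jars) {
                urls.add(jar.toUri().toURL());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return urls;
    }

    /** Builds one ClassLoader over the jars of all given directories (e.g. lib/ and bin/). */
    public static URLClassLoader loaderFor(Path... dirs) {
        List<URL> urls = new ArrayList<>();
        for (Path dir : dirs) {
            urls.addAll(jarUrls(dir));
        }
        // Assumption: the system class loader is used as the parent.
        return new URLClassLoader(urls.toArray(new URL[0]), ClassLoader.getSystemClassLoader());
    }

    /** Loads the named class from the given loader and calls its no-arg constructor. */
    public static Object instantiate(ClassLoader loader, String className)
            throws ReflectiveOperationException {
        Class<?> type = loader.loadClass(className);
        return type.getDeclaredConstructor().newInstance();
    }
}
```

In an embedded setup the jars are already on the application classpath, so 
both the loader construction and the reflective call become unnecessary, which 
is the simplification discussed next.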

The groundbreaking idea is that the simplest possible version of an embedded 
Tomcat is just a version of Bootstrap with some of those steps removed.

When you take `Bootstrap` and pull out the classloader construction and the 
reflection needed to call `Catalina`, you really only have this code left:

 - https://github.com/apache/tomee/blob/master/tomee/tomee-bootstrap/src/main/java/org/apache/tomee/bootstrap/Server.java#L239-L247

From there you really only have two challenges: 1) ensure the right jars are in 
the classpath and 2) ensure there are config files on disk somewhere Tomcat can 
find them.

For challenge #1, the art of perfectly constructing a classpath that would be 
100% identical to the corresponding Apache TomEE zip is actually a bit hard.  
The primary reason is Maven's transitive dependencies.  To tackle this I wrote 
a tool that will extract a TomEE zip, analyze the libraries in it, map them 
back to Maven coordinates, then generate a pom.xml that includes *only* those 
jars with all transitive dependencies excluded.
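
To make the "transitive dependencies excluded" part concrete, here is a minimal 
sketch of what emitting one such pom.xml entry could look like.  It is not the 
code in the generator tool; the class `BomEntry` and the coordinates in the 
usage note are hypothetical, and it assumes Maven's wildcard exclusion syntax 
(supported since Maven 3.2.1) is an acceptable way to switch off transitive 
resolution for a dependency.

```java
// Hypothetical illustration only; not the actual generator code.
public class BomEntry {

    /**
     * Renders a single <dependency> element whose transitive dependencies
     * are all excluded via a wildcard exclusion.
     */
    public static String dependency(String groupId, String artifactId, String version) {
        return String.join("\n",
                "<dependency>",
                "  <groupId>" + groupId + "</groupId>",
                "  <artifactId>" + artifactId + "</artifactId>",
                "  <version>" + version + "</version>",
                "  <exclusions>",
                "    <exclusion>",
                "      <groupId>*</groupId>",
                "      <artifactId>*</artifactId>",
                "    </exclusion>",
                "  </exclusions>",
                "</dependency>");
    }
}
```

Calling `BomEntry.dependency("org.example", "example-lib", "1.0.0")` once per 
jar found in the zip would yield a dependency list that resolves to exactly 
those jars and nothing more.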

The class that will do it is here (first link) and an example of its results 
is here (second link):

 - https://github.com/apache/tomee/blob/master/tomee/tomee-bootstrap/src/test/java/org/apache/tomee/bootstrap/GenerateBoms.java
 - https://github.com/apache/tomee/blob/master/boms/tomee-webprofile/pom.xml#L48-L1379

For challenge #2, we leverage the fact that Tomcat is already very happy to 
look anywhere for configuration files and webapps.  Tomcat has the concept of a 
home and a base.  Home is where all the server libraries live and base is 
where the configuration and applications live.  Instead of elaborate DSLs we 
simply create a very tiny catalina.base in a temp directory.  To do that we 
just need to make adding and modifying files in that catalina.base a little 
easier.  Here's where `Archive` enters the picture: it is effectively a very 
tiny API that allows a directory structure to be built in memory.

 - https://github.com/apache/tomee/blob/master/tomee/tomee-bootstrap/src/main/java/org/apache/tomee/bootstrap/Archive.java
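
For readers who want the general shape before opening the link, here is a 
minimal sketch of the idea of building a directory structure in memory and 
then writing it to a temp directory.  It is deliberately not the real 
`Archive` API: the class `MiniArchive` and its methods are hypothetical, and 
the file contents in the usage note are placeholders.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; the real Archive class differs.
public class MiniArchive {

    // Relative path -> file content, kept in insertion order.
    private final Map<String, byte[]> entries = new LinkedHashMap<>();

    /** Registers a text file at the given relative path; nothing touches disk yet. */
    public MiniArchive add(String path, String content) {
        entries.put(path, content.getBytes(StandardCharsets.UTF_8));
        return this;
    }

    /** Writes every registered entry under a fresh temp directory and returns it. */
    public Path toTempDir(String prefix) {
        try {
            Path root = Files.createTempDirectory(prefix);
            for (Map.Entry<String, byte[]> entry : entries.entrySet()) {
                Path file = root.resolve(entry.getKey());
                Files.createDirectories(file.getParent());
                Files.write(file, entry.getValue());
            }
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Something like `new MiniArchive().add("conf/system.properties", 
"color=orange").toTempDir("catalina-base")` would then produce a directory 
that could serve as the catalina.base.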

Between the two it gets the job done.

This is beta quality at the moment, but looking great.  There seems to be an 
issue getting the tomee-microprofile, tomee-plus and tomee-plume versions to 
work; however, tomee-webprofile appears to be operating flawlessly.

I backported this to TomEE 7.0.x and there all distributions work perfectly.  
There seems to be something with our MicroProfile support that causes a CDI 
conflict on load.  I'll post about this separately.

If you managed to make it to the end of this, let me know.  :)

I suspect there's so much here people won't even know where to start with 
questions and likely I'll hear nothing from anyone.  I'm truly happy to hear any 
feedback at all.  It always feels great to hear people's thoughts and have 
discussion after a big coding bender.

Though I've documented it and tried to make it look polished, none of this is 
set in stone and there is ample room for rewrites, different ideas and big 
changes.


-David

