A "Kamel" crazy idea

Nicola Ferraro Thu, 12 Jul 2018 16:30:33 -0700

Hi Cameleers,
it's been a while since I started thinking about a new project that
we could begin here at Apache Camel, and I'd like to have your opinion.


We've already been targeting cloud-native applications with Camel,
especially on top of Kubernetes, which is becoming "the standard" cloud
platform. But writing a Camel integration and running it on Kubernetes
requires some effort: choosing the base platform (spring-boot, karaf,
simple main?), adding health checks (actuator?), packaging a docker image
and creating the Kubernetes resources (fabric8-maven-plugin, helm?),
publishing the image on a docker registry, then finally deploying the
resources on a Kubernetes cluster.

The resulting integration container is then far from being optimal from a
resource consumption point of view: it is likely that a Camel Spring-Boot
application will require at least 200MB of RAM and also some CPU shares
because of polling threads used by many components.

If people use a CI/CD pipeline, it will also take a long time to get
from a code update to having a Kubernetes Pod up and running.
Apart from compilation and image push/pull time, startup time is also often
~10 seconds for Camel + Spring-Boot in a container with standard limits on
resources, making it difficult to propose this combination for "serverless
integration" (a term that is becoming increasingly popular).

So, my proposal is to start to investigate a "more cloud-native" approach
to integration: *making Camel integrations first-class citizens in
Kubernetes, and making them super fast and lightweight.*

We can base the project on Kubernetes Custom Resource Definitions (CRD)
<https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/>,
for example an Integration CRD, and have a Kubernetes "operator"
<https://coreos.com/operators/> taking care of:
- Optimizing the integration that we want to run
- Packaging in a container
- Running it on Kubernetes
- Managing its entire lifecycle
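
To make the four responsibilities above a bit more concrete, here is a
minimal sketch of how an operator could walk an integration through them,
one step per reconcile call. The phase names and the "status.phase" field
are assumptions made for illustration; they are not taken from the POC.

```python
# Hypothetical phase names: the proposal only lists the responsibilities,
# it does not define how the operator tracks them.
PHASES = ["Optimizing", "Packaging", "Running"]


def reconcile(integration):
    """Advance the integration by one phase and return the new status.

    A real operator would be invoked on every change to the resource;
    here each call simply moves to the next phase until "Running".
    """
    status = dict(integration.get("status", {}))
    current = status.get("phase")
    if current is None:
        status["phase"] = PHASES[0]
    elif current != PHASES[-1]:
        status["phase"] = PHASES[PHASES.index(current) + 1]
    # Once "Running", further calls leave the phase unchanged: this is where
    # lifecycle management (updates, restarts, deletion) would hook in.
    return status


integration = {"metadata": {"name": "example"}, "spec": {"replicas": 1}}
history = []
for _ in range(4):
    integration["status"] = reconcile(integration)
    history.append(integration["status"]["phase"])
print(history)
```

Keeping each reconcile call to a single small step is only one possible
design; it makes the current state visible on the resource itself.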

A Kubernetes-native integration may look like:

-------------------
kind: "Integration"
apiVersion: "camel.apache.org/v1alpha1"
metadata:
 name: "example"
spec:
 replicas: 1
 routes:
  - id: timer
    route:
     - type: endpoint
       uri: timer:tick
     - type: endpoint
       uri: log:info
-------------------

For those who are not familiar with Kubernetes resources, this kind of
YAML/JSON resource definition is really common.
The example route is embedded in the Kubernetes resource declaration and
follows a basic "flow DSL". We may start from a basic one and evolve it as
new requirements arrive from the community.
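
As a rough illustration of how an operator could read such a resource,
here is a small sketch that loads the example above (written as JSON, so
that only the standard library is needed) and extracts the endpoints of
each route. Treating the first endpoint as the source and the rest as
destinations is an assumption of this sketch, as is the function name
describe_routes.

```python
import json

# The same Integration resource as the example above, expressed as JSON.
RESOURCE = """
{
  "kind": "Integration",
  "apiVersion": "camel.apache.org/v1alpha1",
  "metadata": {"name": "example"},
  "spec": {
    "replicas": 1,
    "routes": [
      {"id": "timer",
       "route": [
         {"type": "endpoint", "uri": "timer:tick"},
         {"type": "endpoint", "uri": "log:info"}
       ]}
    ]
  }
}
"""


def describe_routes(resource):
    """Return {route id: (source uri, [destination uris])}.

    Assumption: the first endpoint of a route is where it starts from,
    and the following endpoints are where messages are sent.
    """
    if resource.get("kind") != "Integration":
        raise ValueError("not an Integration resource")
    result = {}
    for route in resource["spec"]["routes"]:
        uris = [s["uri"] for s in route["route"] if s["type"] == "endpoint"]
        if not uris:
            raise ValueError("route %r has no endpoints" % route["id"])
        result[route["id"]] = (uris[0], uris[1:])
    return result


routes = describe_routes(json.loads(RESOURCE))
print(routes)
```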

I've made a very simple (but working) POC here:
https://github.com/nicolaferraro/integration-operator.

This idea of a "Cloud-Native Camel" on Kubernetes (project codename can be
"*Kamel*", if you like it :D) will be an enabler for a lot of nice features.

For example, we can propose "Kamel" as the "ideal" platform for "serverless
integration" (I see many people reinventing the wheel out there): the
operator can reduce the resource consumption of a single integration by
optimizing the runtime, and can also pause/resume integrations when they are
not used, which is the basic idea behind "serverless" (e.g. think of
HTTP-triggered integrations, but not only those).
Focusing on serverless will put more emphasis on push-based notifications
(webhooks, cloud events <https://cloudevents.io/>), which are rarely used in
Camel components. Most components prefer a poll-based approach because it is
simpler to use in classic deployments, but it is not so good in the cloud,
where more resources translate directly into higher costs for the users.

The presence of the simplified DSL also enables experimenting with
"*reduced* subsets of Camel" implemented in languages other than Java, for
example a language that has a reactive approach to thread scheduling and a
really low memory footprint, like Go.

But apart from this kind of experiment (which is valid IMO), the "Kamel"
optimizer will have free rein to choose the right platform for the
integration that the user wants to run, including, in the future, doing AOT
compilation using GraalVM (less memory, faster startup) if the features
(components) used in the integration support it (maybe we can add
AOT compilation to the roadmap for Camel 3).
A silly optimization: integrations starting from "timer:..." may be
scheduled directly with Kubernetes CronJobs, so they will consume resources
only when actually running.
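
A minimal sketch of that timer optimization, under several assumptions: the
timer period is given in milliseconds through a "period" option (1000 when
omitted), only periods that map cleanly onto a cron schedule with minute
granularity are converted, and the function names and the image name are
hypothetical. The CronJob apiVersion used here is "batch/v1"; older clusters
expose CronJobs under "batch/v1beta1".

```python
from urllib.parse import parse_qs


def timer_to_cron(uri):
    """Map 'timer:name?period=<millis>' to a cron schedule, or None.

    None means the route cannot be turned into a CronJob by this sketch
    and should keep running as a normal long-lived integration.
    """
    if not uri.startswith("timer:"):
        return None
    _, _, query = uri.partition("?")
    params = parse_qs(query)
    try:
        period_ms = int(params.get("period", ["1000"])[0])
    except ValueError:
        return None
    minutes, remainder = divmod(period_ms, 60000)
    # Cron has minute granularity; only accept periods that divide an hour
    # evenly, so that "*/n" fires at regular intervals.
    if remainder != 0 or not 1 <= minutes < 60 or 60 % minutes != 0:
        return None
    return "*/%d * * * *" % minutes


def cron_job_for(name, uri, image):
    """Build a CronJob manifest (as a dict) for a timer-based integration."""
    schedule = timer_to_cron(uri)
    if schedule is None:
        return None
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            "schedule": schedule,
            "jobTemplate": {"spec": {"template": {"spec": {
                "restartPolicy": "OnFailure",
                # Hypothetical image produced by the packaging step.
                "containers": [{"name": name, "image": image}],
            }}}},
        },
    }


job = cron_job_for("example", "timer:tick?period=300000", "example:latest")
print(job["spec"]["schedule"])
```

With a five-minute period the container would only exist for the duration
of each run, instead of idling between ticks.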

Since the final integrations are lightweight and the DSL is
language-independent, we may see an increased adoption of Camel also as an
agile integration layer for not-only-Java applications (both "cloud" and
"serverless" applications).

I'm the first one who would like to work on a project like this. I've
worked on many Kubernetes/OpenShift-based applications and frameworks in
the past years, also on operators and CRDs, and I think this way of
redesigning integrations has a lot of potential.

Integrations will not necessarily be limited to the simplified DSL: we
can add extension points for scripting and even custom libraries (although
at the cost of limiting the freedom of the optimizer).

The most important thing: it may become a great project, since it's driven
by a great community.

So, what do you think? Is it crazy enough?

Nicola
