Hi everyone,

Unfortunately, I can't do a tl;dr this time, as this matter requires a
lot of context.

This email will take you < 20 minutes to read, according to
https://thereadtime.com/.

As you may have followed on a separate thread
(https://lists.apache.org/thread/nknm6j641qk2c7cl621tsy3fy98tsc69),
many of us have been working towards removing a circular dependency
currently present between `kogito-apps` and `kie-tools`. As we
progressed towards a solution, we kept seeing the circular dependency
pop up somewhere else. I'll break down what we did and the results we
got.

Right now, even though we started the effort to move the Quarkus Dev
UI modules to `kie-tools`, we haven't been able to do it yet, as we've
been busy upgrading KIE Tools to Java 17, Maven 3.9.6, and Quarkus
3.2.9, compatible with Kogito Runtimes 999-20240218-SNAPSHOT. This
effort was concluded this Monday, with
https://github.com/apache/incubator-kie-tools/pull/2136.

The current scenario we have is:

                01. incubator-kie-kogito-runtimes
        |==>    02. incubator-kie-kogito-apps
    C   |       03. incubator-kie-kogito-examples
    Y   |       04. incubator-kie-kogito-images
    C   |       05. incubator-kie-kogito-serverless-operator
    L   |       ==========================
    E   |       06. incubator-kie-sandbox-quarkus-accelerator
        |==>    07. incubator-kie-tools


        * As `kie-tools`/extended-services depends on `kogito-apps`/jitexecutor;
        * and `kogito-apps`/{sonataflow,bpmn}-quarkus-devui depend on
          `kie-tools`/{many packages}


After moving the Quarkus Dev UIs to `kie-tools`, we would've had:

                01. incubator-kie-kogito-runtimes
                02. incubator-kie-kogito-apps
                03. incubator-kie-kogito-examples
    C   |==>    04. incubator-kie-kogito-images
    Y   |       05. incubator-kie-kogito-serverless-operator
    C   |       =====================
    L   |       06. incubator-kie-sandbox-quarkus-accelerator
    E   |==>    07. incubator-kie-tools

        * As `kie-tools`/kn-plugin-workflow depends on
          `kogito-images`/kogito-swf-devmode;
        * and `kogito-images`/kogito-swf-devmode depends on
          `kie-tools`/sonataflow-quarkus-devui


After moving the `kogito-swf-devmode` image to `kie-tools`, we would've had:

                01. incubator-kie-kogito-runtimes
                02. incubator-kie-kogito-apps
                03. incubator-kie-kogito-examples
                04. incubator-kie-kogito-images
    C   |==>    05. incubator-kie-kogito-serverless-operator
    Y   |       =====================
    C   |       06. incubator-kie-sandbox-quarkus-accelerator
    L   |==>    07. incubator-kie-tools
    E

        * As `kie-tools`/kn-plugin-workflow depends on
          `kogito-serverless-operator`;
        * and `kogito-serverless-operator` depends on
          `kie-tools`/kogito-swf-devmode


Clearly, we have a much bigger problem than a simple circular dependency.
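
To make the pattern in the three scenarios above concrete, here is a
minimal Python sketch. It is not part of any existing pipeline. It
models only the package-level dependencies named above, and assumes
that each of them holds no matter which repo hosts the package. The
names `PKG_DEPS`, `TODAY`, `repos_in_cycles`, and `move` are
hypothetical, and `some-kie-tools-package` is a placeholder for the
"{many packages}" the Dev UIs consume.

```python
# Package-level dependencies named in the scenarios above
# ("X": {"Y"} means X depends on Y). "some-kie-tools-package" is a
# hypothetical stand-in for the "{many packages}" the Dev UIs consume.
PKG_DEPS = {
    "extended-services": {"jitexecutor"},
    "sonataflow-quarkus-devui": {"some-kie-tools-package"},
    "kogito-swf-devmode": {"sonataflow-quarkus-devui"},
    "kogito-serverless-operator": {"kogito-swf-devmode"},
    "kn-plugin-workflow": {"kogito-swf-devmode", "kogito-serverless-operator"},
}

# Repo hosting each package today (simplified model).
TODAY = {
    "extended-services": "kie-tools",
    "some-kie-tools-package": "kie-tools",
    "kn-plugin-workflow": "kie-tools",
    "jitexecutor": "kogito-apps",
    "sonataflow-quarkus-devui": "kogito-apps",
    "kogito-swf-devmode": "kogito-images",
    "kogito-serverless-operator": "kogito-serverless-operator",
}


def repos_in_cycles(pkg_deps, owner):
    """Return every repo that can reach itself through cross-repo dependencies."""
    # Project package-level edges onto the repos that host the packages.
    edges = {}
    for pkg, deps in pkg_deps.items():
        for dep in deps:
            if owner[pkg] != owner[dep]:
                edges.setdefault(owner[pkg], set()).add(owner[dep])

    def reachable(start):
        # Iterative depth-first search over the repo-level edges.
        seen, stack = set(), list(edges.get(start, ()))
        while stack:
            repo = stack.pop()
            if repo not in seen:
                seen.add(repo)
                stack.extend(edges.get(repo, ()))
        return seen

    return {repo for repo in edges if repo in reachable(repo)}


def move(owner, package, repo):
    """Return a copy of the ownership map with one package moved to another repo."""
    return {**owner, package: repo}


after_dev_uis = move(TODAY, "sonataflow-quarkus-devui", "kie-tools")
after_devmode = move(after_dev_uis, "kogito-swf-devmode", "kie-tools")

for label, owner in [
    ("today", TODAY),
    ("after moving the Dev UIs", after_dev_uis),
    ("after moving kogito-swf-devmode", after_devmode),
]:
    print(label, "->", sorted(repos_in_cycles(PKG_DEPS, owner)))
```

In this simplified model, each move shrinks the set of repos involved
in a cycle, but `kie-tools` stays in one every time.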

After multiple conversations with a lot of people, it's been really
hard coming up with a simple solution that makes it possible to build
Apache KIE in one shot, while preserving the way everyone is used to
contributing to the multiple repositories we have. More than that,
while making this assessment, I found more problems that, in my
perspective, block Apache KIE 10.

In light of that difficulty, I'm coming forward with my proposal for
the Apache KIE release process, so we can use Apache's mechanisms to
have a slower-paced, in-depth debate about this really complicated
matter.

I'll lay out my entire perspective about the current situation of our
codebase, as well as problems I can currently see. I'll start with an
analysis of the repositories and their purposes, point out some
problems that I believe are blocking our 10 release, explain my
proposal, and discuss some consequences of what I'm proposing.

Let's begin.


# THE APACHE KIE REPOS

A. DROOLS, OPTAPLANNER, & KOGITO (count: 11)
- incubator-kie-kogito-pipelines @ `main`
- incubator-kie-drools @ `main`
- incubator-kie-optaplanner @ `main`
- incubator-kie-optaplanner-quickstarts @ `main`
- incubator-kie-kogito-runtimes @ `main`
- incubator-kie-kogito-apps @ `main`
- incubator-kie-kogito-examples @ `main`
- incubator-kie-kogito-images @ `main`
- incubator-kie-kogito-serverless-operator @ `main`
- incubator-kie-kogito-docs @ `main`
- incubator-kie-docs @ `main-kogito`

B. TOOLS (count: 2)
- incubator-kie-sandbox-quarkus-accelerator @ `0.0.0`
- incubator-kie-tools @ `main`

C. BENCHMARKS (count: 2)
- incubator-kie-kogito-benchmarks @ `main`
- incubator-kie-benchmarks @ `main`

D. ARCHIVED (count: 1)
- incubator-kie-kogito-operator

E. "NON-CODE" (count: 5)
- incubator-kie-issues @ `main`
    (Issues only, README should be updated @ `main`. Same for GitHub
Actions workflows.)
- incubator-kie-kogito-website @ `main`
    (The Kogito website. Develop & deploy at the `main` branch.)
- incubator-kie-website @ `main`
    (The KIE website. Develop @ `main`. Push @ `deploy` to update the website.)
- incubator-kie-kogito-online @ `gh-pages`
    (GitHub pages used to host sandbox.kie.org and KIE Tools' Chrome
Extension assets.)
- incubator-kie-kogito-online-staging @ `main`
    (Same as above, but for manual sanity checks during the staging
phase of a release.)

TOTAL (count: 21)

I grouped the repositories by category, and listed them in a
topological order. Keep in mind that when flattening out a tree, there
are multiple possibilities. For example, OptaPlanner could've been
placed in any position after Drools.
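
As a small illustration of why several flattened orders are equally
valid, here is a Python sketch that checks whether a flat list
respects a set of dependency edges. Only the "OptaPlanner comes after
Drools" constraint is taken from the text above. The other edges in
`DEPENDS_ON`, and the function name `is_topological_order`, are
assumptions made for the example.

```python
# "X": {"Y"} means X depends on Y, so Y must be listed before X.
# Only the drools -> optaplanner ordering is stated in the text;
# the remaining edges are assumptions for illustration.
DEPENDS_ON = {
    "drools": set(),
    "optaplanner": {"drools"},
    "optaplanner-quickstarts": {"optaplanner"},
    "kogito-runtimes": {"drools"},
    "kogito-apps": {"kogito-runtimes"},
}


def is_topological_order(order, depends_on):
    """Return True if every repo appears after all of its dependencies."""
    position = {repo: index for index, repo in enumerate(order)}
    return all(
        position[dep] < position[repo]
        for repo, deps in depends_on.items()
        for dep in deps
    )


# Two different flattenings of the same graph.
optaplanner_early = [
    "drools", "optaplanner", "optaplanner-quickstarts",
    "kogito-runtimes", "kogito-apps",
]
optaplanner_late = [
    "drools", "kogito-runtimes", "kogito-apps",
    "optaplanner", "optaplanner-quickstarts",
]
# Not valid: OptaPlanner is listed before Drools.
optaplanner_first = [
    "optaplanner", "drools", "optaplanner-quickstarts",
    "kogito-runtimes", "kogito-apps",
]

for order in (optaplanner_early, optaplanner_late, optaplanner_first):
    print(is_topological_order(order, DEPENDS_ON), order)
```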

Category A repos are what I've been referring to as the `drools` and
`kogito-*` stream. Of course, OptaPlanner is inside that stream too,
as the way these repositories reference each other is through Maven
SNAPSHOTs, more specifically the 999-SNAPSHOT version. This mechanism
is well-known to the team and, although flawed for intra-day builds
and disruptive for people in many different time zones, it is already
very comfortable for everyone to work with, I assume.

Contributions made to Category A have some dedicated pipelines, which
are, at least to some extent, able to build cross-repo PRs together
and verify that the codebase will continue working as expected after
they're all merged. From what I could gather, there are some
"sub-streams" currently configured for cross-repo PRs.

- kogito-pipelines
- drools, kogito-runtimes, kogito-apps, and kogito-examples
- optaplanner, and optaplanner-quickstarts
- kogito-images, and kogito-serverless-operator
- kogito-docs
- kie-docs

This means that sending cross-repo PRs to any combination of repos
that are not part of the same "sub-stream" cannot be verified before
merging, making our contribution model dependent on individual
contributors building stuff on their machines to verify that it works.
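
The sub-stream rule above can be sketched as a simple check in Python.
This is only an illustration of the rule as I described it, not how
the pipelines are actually implemented. `SUB_STREAMS` mirrors the list
above, and `can_verify_together` is a hypothetical name.

```python
# The "sub-streams" listed above, using short repo names.
SUB_STREAMS = [
    {"kogito-pipelines"},
    {"drools", "kogito-runtimes", "kogito-apps", "kogito-examples"},
    {"optaplanner", "optaplanner-quickstarts"},
    {"kogito-images", "kogito-serverless-operator"},
    {"kogito-docs"},
    {"kie-docs"},
]


def can_verify_together(repos):
    """Return True if all repos touched by a cross-repo PR share one sub-stream."""
    touched = set(repos)
    return any(touched <= stream for stream in SUB_STREAMS)


# Same sub-stream: can be built together before merging.
print(can_verify_together(["kogito-runtimes", "kogito-apps"]))
# Different sub-streams: nothing verifies the combination before merging.
print(can_verify_together(["kogito-apps", "kogito-images"]))
```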

I based this analysis on
https://github.com/apache/incubator-kie-kogito-pipelines/blob/main/.ci/project-dependencies.yaml,
https://github.com/apache/incubator-kie-optaplanner/blob/main/.ci/buildchain-project-dependencies.yaml,
and 
https://github.com/apache/incubator-kie-kogito-pipelines/blob/main/.ci/jenkins/config/branch.yaml.
Note that I'm not that familiar with these pipelines, so someone
please correct me if I'm wrong.

Category B repos are what I've been referring to as the `kie-tools`
stream. The first repo there is a template repository that is used by
people starting projects from scratch on KIE Sandbox, similar to a
Maven archetype, if you will. The other one is the KIE Tools monorepo,
a polyglot monorepo with `pnpm` as its build system. Currently, KIE
Tools hosts Java libraries and apps, TypeScript libraries and apps, Go
apps, Docker images, and Helm charts. The `kie-tools` monorepo is
configured to work with sparse checkouts and can do partial builds.
Category B repos refer to Category A repos through timestamped
SNAPSHOTs. This is a new mechanism we recently introduced that builds
and publishes immutable, persistent artifacts under a version
following the 999-YYYYMMDD-SNAPSHOT format, published weekly every
Sunday night. Timestamped SNAPSHOTs are an evolution of the Kogito
releases: as we're now targeting one release for all of Apache KIE,
we can't have Kogito releases anymore.
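
For illustration, the version format can be sketched in Python as
follows. This assumes the date stamp is the date of the Sunday on
which the SNAPSHOT is published, which is consistent with
999-20240218-SNAPSHOT (18 Feb 2024 was a Sunday) but is still an
assumption. The function names are hypothetical and not part of any
existing tooling.

```python
import re
from datetime import date, timedelta

# 999-YYYYMMDD-SNAPSHOT, as described above.
SNAPSHOT_PATTERN = re.compile(r"^999-(\d{8})-SNAPSHOT$")


def snapshot_version(day: date) -> str:
    """Format the timestamped SNAPSHOT version for a given day."""
    return f"999-{day:%Y%m%d}-SNAPSHOT"


def latest_sunday(today: date) -> date:
    """Return the most recent Sunday on or before `today`."""
    # date.weekday(): Monday is 0 and Sunday is 6.
    return today - timedelta(days=(today.weekday() + 1) % 7)


def snapshot_date(version: str) -> date:
    """Parse the date out of a timestamped SNAPSHOT version."""
    match = SNAPSHOT_PATTERN.match(version)
    if match is None:
        raise ValueError(f"not a timestamped SNAPSHOT: {version}")
    stamp = match.group(1)
    return date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:]))


# A Wednesday maps back to the previous Sunday's SNAPSHOT.
print(snapshot_version(latest_sunday(date(2024, 2, 21))))
```

Note that the mutable 999-SNAPSHOT version used inside Category A is
deliberately rejected by `snapshot_date`, since it carries no date.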

An important note here is that Category B repositories have been
historically kept out of any automations we used to have, way back
when Kogito started and we had the Business Central (a.k.a. v7) stream
still going on. For this reason, Category B projects have developed
their own automations, based on GitHub Actions. Category B repos have
always depended on Category A repos using fixed versions. If Category
B repos had adopted mutable SNAPSHOTs, breaking changes on
Category A repositories would've had the potential to break Category B
silently, leaving Category B with a broken development stream and
introducing unpleasant surprises for maintainers of Category B repos,
as historically Category A contributors were not familiar with
Category B repos.

Contributions made to Category B repos go through a GitHub Actions
workflow that builds the relevant part of the `kie-tools` monorepo for
the changes introduced. Changes made to the pipeline itself are also
picked up as part of PRs, allowing us to do things like atomically
bumping the Node.js version, for example. More importantly, it allows
us to upgrade the repository to a new timestamped SNAPSHOT together
with the changes necessary to make it stay green.

This setup, however, makes it impossible to have cross-repo PRs
involving Category A and Category B simultaneously, with the current
automations we have.

Category C repos are kind of floating around, and I'm not sure if
there's much activity going on there. Regardless, as they're part of
Apache KIE, they will be part of our release, so I listed them for us
to take them into consideration too.

Category D is self-explanatory. There's only one repo, which has
already been marked for archiving.

Category E repos do not host code directly. They are either
organizational entities or host websites, and are currently not part
of any pipelines we have.

This lack of unification between Category A and Category B is, IMHO,
what allowed us to introduce the infamous circular dependency between
`kie-tools` and `kogito-apps`, which we can now describe as a circular
dependency between Category A and Category B. The way I see it, if we
had a single pipeline, building everything from `drools` to
`kie-tools`, such flaws would've never been introduced, and we
wouldn't have this huge problem on our hands right now.

My proposal for the Apache KIE release process sees this lack of
unification as a central problem, not only for this release in
particular, but for the community as a whole. It is my belief that we
are all under the same roof, and that no contribution should be
allowed to break any part of our codebase. With the increasing volume
of code, and hopefully number of contributors too, we cannot keep
counting on "common sense" to avoid breaking things. We're all human,
after all, and it is our job to have mechanisms in place to prevent us
from unwittingly making mistakes, especially when these mistakes
impact parts of the codebase that we, individually, probably can't
fix.


# THE PROBLEMS WE HAVE RIGHT NOW

P1. Quarkus Dev UIs @ `kogito-apps` depending on kiegroup's KIE Tools `0.32.0`.
See:
- https://github.com/search?q=repo%3Akiegroup%2Fkogito-apps+path%3Apackage.json+kie-tools&type=code


P2. PR open for Kogito SWF images @ `kogito-images` depending on
kiegroup's KIE Tools `0.32.0`.
See:
- https://github.com/apache/incubator-kie-tools/tree/main/packages/sonataflow-deployment-webapp


P3. DashBuilder @ `kie-tools` depending on kiegroup's `lienzo` and
`kie-soup` artifacts at version `7.59.0.Final`.
See:
- https://github.com/apache/incubator-kie-tools/blob/main/packages/dashbuilder/pom.xml#L64
- https://github.com/search?q=repo%3Aapache%2Fincubator-kie-tools+path%3Apackages%2Fdashbuilder+%24%7Bversion.org.kie%7D&type=code


P4. Multiple packages @ `kogito-apps` depending on kiegroup's
Explainability `1.22.1.Final`.
* This module was removed from the KIE codebase here:
https://github.com/apache/incubator-kie-kogito-apps/commit/bbb22c06d37e77b97aae6496d74abe43a8cfc965
and now lives on
https://github.com/trustyai-explainability/trustyai-explainability,
under a different GAV.
* This new repo depends on Kogito and OptaPlanner, pointing to older versions.
See:
- https://github.com/search?q=repo%3Aapache%2Fincubator-kie-kogito-apps+%3Eexplainability-core%3C&type=code
- https://github.com/trustyai-explainability/trustyai-explainability/blob/main/pom.xml#L52-L53


P5. `incubator-kie-sandbox-quarkus-accelerator` depending on Kogito
`1.32.0.Final` and Quarkus `2.15.3.Final`.
See:
- https://github.com/apache/incubator-kie-sandbox-quarkus-accelerator/blob/0.0.0/pom.xml#L32-L38


P6. Category C repos are out of date and not part of the Category A
CI/Release pipelines.
* incubator-kie-kogito-benchmarks: (Current version is `2.0-SNAPSHOT`,
depending on Kogito without a specific version, only by using
`http://localhost:8080`)
* incubator-kie-benchmarks: (Current version is `1.0-SNAPSHOT`,
pointing to Drools 999-SNAPSHOT and OptaPlanner `8.45.0-SNAPSHOT`)


P7. `kie-tools`/packages/kn-plugin-workflow has its E2E disabled after
upgrading to 999-20240218-SNAPSHOT.


In my perspective, P1 and P2 have the same solution, as they both
suffer from the circular dependency between Category A and Category B.
As Category A and Category B are both streams that have been really
active, I see this as a blocker: there are contributions that cannot
be made, given that Category A depends on Category B with a lag of
one release.

P3 and P4, although not ideal, can be understood as technical debt.
Depending on unmaintained projects is something we'll always be
susceptible to, given time.

P5 and P6 are easily fixable, as it's just a matter of making them
part of the play.

P7 is an isolated problem that won't impact the structure or anything
that we're talking about here, but it is a regression we introduced
recently.

Assuming P3 and P4 can be ignored for Apache KIE 10, and that P5, P6,
and P7 have easy fixes, the only problems left to discuss are P1 and
P2, which can't be solved without a proper proposal.


# THE PROPOSAL

I'll try to be very meticulous here since, from my experience, any
little miscalculation can lead to our release not working out in the
end. To avoid that as much as possible, and to do everything we can to
have a successful Apache KIE 10 release, bear with me. I'll lay out a
timeline of events that need to happen in order for our release to be
published, with all artifacts ending up in the right places, but
first, we need to solve problems P1 and P2.

As you saw at the beginning of this email, all the attempts we made
left us with the circular dependency showing up at a different place,
but something all these places have in common is that they're all
after `kogito-apps` and before Category B.

The first part of my proposal is the following:

S1. We keep the original plan of moving the Quarkus Dev UIs from
`kogito-apps` to `kie-tools`, together with Management and Task
consoles from `kogito-images` to `kie-tools`.
S2. We move the `kogito-swf-devmode` and `kogito-swf-builder` images
from `kogito-images` to `kie-tools` too.
S3. We move the entire `kogito-serverless-operator` repo inside a new
package on `kie-tools`, keeping Git history.

Solutions S1, S2, and S3 together solve problems P1 and P2. Of course
the rest of https://github.com/apache/incubator-kie-issues/issues/967
would still be done too.

This doesn't come without consequences, of course, as the
`kogito-swf-devmode` and `kogito-swf-builder` images, and the
`kogito-serverless-operator` would be moving from Category A to
Category B. This move would make them have to reference Category A
repos through timestamped SNAPSHOTs. Since `kogito-images` and
`kogito-serverless-operator` are already their own "sub-stream" inside
Category A, though, contributions made in a cross-repo fashion to this
"sub-stream" will continue being possible, now via a single PR to
`kie-tools`. Cross-repo PRs between Category A and Category B will
continue not being possible, and a 1-week delay between merging
something on Category A and using it on Category B will still happen.

It's worth mentioning that `kie-tools`, however, does allow for sparse
checkouts and partial builds, so working with a subset of the monorepo
is possible and encouraged. Making changes only to
`packages/kn-plugin-workflow`, for example, will have the PR checks
run in < 10 minutes, as you can see here:
https://github.com/apache/incubator-kie-tools/actions/runs/8237244382/job/22525511722?pr=2136.
We're not compromising when running partial builds either: we know
that the entire repo will continue working even after building only a
small subset of it. Whether a partial or a full build runs is
automatically determined by the changes in a PR.
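
To give an idea of how such a decision can be made, here is a minimal
Python sketch. It is NOT the actual `kie-tools` implementation. It
assumes that a change under `packages/<name>/` requires building that
package plus everything that depends on it, and that any change
outside a known package triggers a full build. The dependency edge in
`DEPENDS_ON`, the file paths, and the function name
`packages_to_build` are hypothetical.

```python
# package -> packages it depends on. The edge below is hypothetical,
# chosen only to show how dependents are pulled into a partial build.
DEPENDS_ON = {
    "sonataflow-deployment-webapp": set(),
    "sonataflow-quarkus-devui": set(),
    "dashbuilder": set(),
    "kn-plugin-workflow": {"sonataflow-quarkus-devui"},
}


def packages_to_build(changed_files, depends_on):
    """Return the packages a PR needs to build, given the files it changed."""
    changed = set()
    for path in changed_files:
        parts = path.split("/")
        inside_package = (
            len(parts) >= 3 and parts[0] == "packages" and parts[1] in depends_on
        )
        if not inside_package:
            # Assumption: root-level or unknown changes mean a full build.
            return set(depends_on)
        changed.add(parts[1])

    # Invert the edges so we can walk from a package to its dependents.
    dependents = {pkg: set() for pkg in depends_on}
    for pkg, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(pkg)

    affected, stack = set(), list(changed)
    while stack:
        pkg = stack.pop()
        if pkg not in affected:
            affected.add(pkg)
            stack.extend(dependents[pkg])
    return affected


# Touching only kn-plugin-workflow builds only kn-plugin-workflow.
print(sorted(packages_to_build(["packages/kn-plugin-workflow/main.go"], DEPENDS_ON)))
# Touching a root-level file builds everything.
print(sorted(packages_to_build(["package.json"], DEPENDS_ON)))
```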

Keep in mind that, even though I'm proposing we move a bunch of
additional stuff into `kie-tools`, I see this as a TEMPORARY solution
for our codebase. `kie-tools` would host some additional stuff
TEMPORARILY so that we can release and continue moving forward.

As I mentioned in other places, `kie-tools` became a polyglot monorepo
out of necessity, and although I'm really proud of what we achieved
there so far, I don't think `kie-tools` has a setup that is suitable
for all the different nuances that compose our community. I'm well
aware that a polyglot monorepo that does not follow widespread
conventions will scare some people away, and as much as we've tried to
make build instructions clear, we can't always get past the prejudice
some people have towards the "front-end" ecosystem.

With all that said, I keep thinking this is the best course of action
for us right now. We keep most of our stuff unchanged, we unblock the
release, and we have a working setup that will suit us well while we
discuss and reach a conclusion regarding the future of our codebase
structure.

Let me paint a quick picture here of what our code base would look
like, repository-wise, if my proposal is accepted:

CATEGORY    REPO
=====================
A           incubator-kie-kogito-pipelines
A           incubator-kie-drools
A           incubator-kie-optaplanner
A           incubator-kie-optaplanner-quickstarts
A           incubator-kie-kogito-runtimes
A           incubator-kie-kogito-apps
A           incubator-kie-kogito-examples
A           incubator-kie-kogito-images
A           incubator-kie-kogito-docs
A           incubator-kie-kogito-benchmarks
A           incubator-kie-docs
A           incubator-kie-benchmarks
=====================
B           incubator-kie-sandbox-quarkus-accelerator
B           incubator-kie-tools
=====================
D           incubator-kie-kogito-operator
=====================
E           incubator-kie-issues
E           incubator-kie-kogito-website
E           incubator-kie-website
E           incubator-kie-kogito-online
E           incubator-kie-kogito-online-staging
=====================

* Category C becomes part of Category A, and
`kogito-serverless-operator` moves entirely inside `kie-tools`.
* With `kogito-swf-{builder,devmode}` images and
`kogito-serverless-operator` inside `kie-tools`, there are no cycles
anymore, as inside `kie-tools`, we can granularly build:
    1. packages/sonataflow-deployment-webapp
    2. packages/sonataflow-quarkus-devui
    3. packages/sonataflow-images (containing `kogito-swf-builder` and
`kogito-swf-devmode`)
    4. packages/sonataflow-operator (contents from `kogito-serverless-operator`)
    5. packages/kn-plugin-sonataflow (`packages/kn-plugin-workflow`,
but renamed)
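
As a sanity check of the "no cycles anymore" claim, the five packages
above can be fed to Python's standard `graphlib` module, which only
produces a build order when the graph is acyclic. This is a sketch
using the proposed (not yet existing) package names. The dependency of
`sonataflow-images` on `sonataflow-deployment-webapp` is an
assumption, and `build_order` is a hypothetical helper.

```python
from graphlib import CycleError, TopologicalSorter

# package -> packages it depends on, using the proposed package names.
# The sonataflow-deployment-webapp edge is an assumption.
PROPOSED = {
    "sonataflow-deployment-webapp": set(),
    "sonataflow-quarkus-devui": set(),
    "sonataflow-images": {"sonataflow-deployment-webapp", "sonataflow-quarkus-devui"},
    "sonataflow-operator": {"sonataflow-images"},
    "kn-plugin-sonataflow": {"sonataflow-images", "sonataflow-operator"},
}


def build_order(depends_on):
    """Return one valid build order, or raise CycleError if there is a cycle."""
    return list(TopologicalSorter(depends_on).static_order())


print(build_order(PROPOSED))

# Adding a back-edge (hypothetical) makes the graph unbuildable again.
with_cycle = {**PROPOSED, "sonataflow-quarkus-devui": {"kn-plugin-sonataflow"}}
try:
    build_order(with_cycle)
except CycleError as error:
    print("cycle detected:", error.args[1])
```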

The second part of the proposal is the release process itself,
assuming the structure above is what we have.

Here it is:

1. Define a timestamped SNAPSHOT to be used as cutting point for
Category A repos.
2. Update Category B repos to point to this timestamped SNAPSHOT, and
verify that everything is working.
3. At this point, with everything working, we can branch out to
`10.0.x`. Category A from the timestamped SNAPSHOT tag, and Category B
from `main`.
4. All Category A and Category B repos update their versions to
10.0.0, in their `10.0.x` branches.
5. Update Category B repos to point to Category A repos using the
10.0.0 version.
6. At this point, we can vote on the release based on the `10.0.x`
branches, given we don't expect any code changes anymore.
7. After voting passes, we're good to start the release process.
8. Category A repos follow their manual/automated release process,
pointing to the `10.0.x` branch. Tags pushed to Git, and built
artifacts pushed to their registries.
9. We wait a little bit (~1 day) for Category A artifacts to
propagate to the registries.
10. Category B repos follow their manual/automated release process,
pointing to the `10.0.x` branch. Tags pushed to Git, and built
artifacts pushed to their registries.
11. Category D repos are ignored.
12. Category E repos can be manually tagged with 10.0.0 from their
default branches.

More needs to be discussed if we're planning to maintain multiple
release streams in parallel, but I guess that can wait until after
Apache KIE 10.

Thank you for reading, and I'm looking forward to hearing back from everyone.

Of course, alternative solutions are possible. This email, however,
summarizes my view of how we should attack the problem, considering
disruption, required effort, the release process itself, and history.
Feel free to propose alternatives. This is not a voting thread.

Regards,

Tiago Bento
