Hi -

This is great and I agree with the plan except for one missing discussion.

The work needs to include how to create documentation in the new
repository and move it to the pulsar-site repository. (This might be better as
a new PIP, which could also cover the Go and Node clients already in separate
repositories.)

Best,
Dave


> On Sep 19, 2022, at 4:25 PM, Matteo Merli <mme...@apache.org> wrote:
> 
> https://github.com/apache/pulsar/issues/17724
> 
> 
> 
> ## Motivation
> 
> Pulsar C++ code base is in the same main repository for the Pulsar project.
> 
> While the decision was the right one at the time, there is a considerable
> overhead in keeping the C++ client in its current position.
> 
> ### Issues with the current approach
> 
> The Pulsar repository has grown a lot in size and number of active developers.
> 
> 1. The frequency of changes in various parts of the codebase has increased to
>    a point where the amount of resources dedicated to CI is very significant.
> 
>    Every change in Java code will trigger the CI jobs for the C++ client and
>    every change in the C++ client will do the same.
> 
>    During a CI job we are building the C++ client multiple times:
>     1. For C++ and Python client tests
>     2. To build Python wheels to be included in the pulsar Docker images (for
>        supporting Pulsar functions)
> 
> 2. The release process for Pulsar has become very complex and requires
>    building a large number of binaries for C++ and Python clients. This has
>    become too much of a burden during the course of a Pulsar release.
> 
> 
> ## Goal
> 
> Decouple the development of C++ and Python client libraries from the
> development of the core components of Pulsar in Java.
> 
> 
> ## Changes
> 
> ### Repositories
> 
> 1. Move the C++ client code to a new repository
>    `github.com/apache/pulsar-client-c++`
> 2. Move the Python client code to a new repository
>    `github.com/apache/pulsar-client-python`
> 
> The change will be done without losing any history, extracting a
> sub-directory into a new Git repository.
> 
> ```
> git filter-repo --subdirectory-filter pulsar-client-cpp
> ```
> 
> ### Release process
> 
> The release process will be split into multiple parts:
> 
> 1. The main Pulsar release will only contain the Java parts (server
>    distribution and Java client library)
> 2. The C++ client will have its own release schedule and versioning
> 3. The Python client will have its own release schedule and versioning
> 
> #### Versioning
> 
> Both C++ and Python clients will continue with their own individual 
> versioning.
> 
> In order to not break anything or cause more confusion, we would need to use
> a new version that is bigger than the current version (2.11.x).
> 
> The suggestion is to start the new releases for both C++ and Python from 
> 3.0.0.
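
A minimal sketch of the ordering concern in the quoted versioning section,
assuming plain dotted numeric versions (no suffixes such as "-rc1"). The
helper names `parse_version` and `is_newer` are hypothetical and are not part
of any Pulsar tooling; they only illustrate why 3.0.0 is a safe starting
point above 2.11.x.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as '2.11.0' into (2, 11, 0)."""
    return tuple(int(part) for part in version.split("."))


def is_newer(candidate: str, current: str) -> bool:
    """True if candidate sorts strictly after current, compared numerically."""
    return parse_version(candidate) > parse_version(current)


# 3.0.0 sorts after any 2.11.x release, so tooling that picks the highest
# version would select the new standalone client release.
print(is_newer("3.0.0", "2.11.0"))

# Comparing the raw strings would order 2.9.0 after 2.11.0, which is why
# the parts are converted to integers before comparing.
print("2.9.0" > "2.11.0", is_newer("2.9.0", "2.11.0"))
```

Comparing integer tuples rather than strings keeps two-digit minor versions
such as 2.11 ordered correctly.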
> 
> 
> #### Existing branches
> 
> In existing branches of Pulsar, the C++ client will remain in the main
> repository and will keep receiving bug fixes in its current location.
> 
> The different location of the new C++ code will make the cherry-picking
> process slightly more painful in the short term, though it will even out in
> the long term.
> 
> 
> ### Project dependencies
> 
> #### C++/Python --> Pulsar
> 
> Both C++ and Python unit/integration tests are designed to run against a
> standalone instance of Pulsar broker. In the current form, they're using the
> `master` code that is built to run the tests.
> 
> After the split, the unit tests will use a Docker image of Pulsar. We can use
> a few different images to test for compatibility:
> 1. Latest stable (e.g. 2.10.1)
> 2. Nightly (Pulsar Docker image published every day from master branch)
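
The image matrix in the quoted list could be sketched roughly as below. This
assumes the tests start a standalone broker from the `apachepulsar/pulsar`
image on the default ports (6650 for the binary protocol, 8080 for HTTP). The
`nightly` tag and the helper name `standalone_command` are hypothetical; the
proposal does not say how the nightly image would be tagged.

```python
from typing import List

PULSAR_IMAGE = "apachepulsar/pulsar"

# "2.10.1" is the stable example from the proposal; "nightly" is an assumed
# tag name used only for illustration.
TEST_TAGS = ["2.10.1", "nightly"]


def standalone_command(tag: str) -> List[str]:
    """Build the docker command that starts a standalone broker for one tag."""
    return [
        "docker", "run", "--rm", "-d",
        "-p", "6650:6650",  # Pulsar binary protocol port
        "-p", "8080:8080",  # HTTP admin/lookup port
        f"{PULSAR_IMAGE}:{tag}",
        "bin/pulsar", "standalone",
    ]


# Print one command per image in the compatibility matrix; a CI job would
# run each one and then execute the client test suite against it.
for tag in TEST_TAGS:
    print(" ".join(standalone_command(tag)))
```

The commands are only printed here, so the sketch runs without Docker
installed.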
> 
> #### Pulsar --> Python
> 
> To create a Pulsar image, we are now building the Python client wheel file
> and then installing it at build time.
> 
> Instead, we are going to include a wheel file for a version of the Python
> client that has already been released.
> 
> #### Python --> C++
> 
> The Python client library is just a wrapper on top of the C++ client. Today
> these are built together, with Python wrapper code residing in a
> sub-directory of C++ client code, and compiled using the same CMake build
> script.
> 
> By separating the Python client into a different repository, we are going to
> depend on an already released version of the C++ client.
> 
> 
> #### Automated documentation in the website
> 
> On the Pulsar website we are auto-generating C++ documentation with the
> Doxygen tool and the Python one with Pdoc.
> 
> Instead of just fetching the main repo code, the website build job should
> also fetch the new repos to run the tooling.
> 
> 
> 
> 
> 
> 
> --
> Matteo Merli
> <mme...@apache.org>
