Hello everyone,

I would like to raise an issue with Pulsar's containers, specifically with how configuration overrides are applied. For instance, the Apache Pulsar Helm chart runs "bin/apply-config-from-env.py conf/broker.conf" and "bin/gen-yml-from-env.py conf/functions_worker.yml" [1] to apply configuration passed in environment variables to the configuration files in the container's root file system. This approach fails when the container's root file system is read-only due to strict security policies (`readOnlyRootFilesystem` in `securityContext`). This issue has been reported as #22088 [2].

A short-term fix could be to write the modified configuration to a temporary file when the root file system is read-only. However, the Python script solution is not ideal, and we should consider eliminating it. In the long term, it would also be beneficial to remove the need for a shell script to start Pulsar, but that's a separate issue.

For configuration handling, we need a solution that can apply overrides in memory, eliminating the need to modify on-disk files. Modern configuration frameworks can do this out of the box. Currently, Pulsar uses a homegrown configuration framework. Instead of extending this framework, I propose we discuss replacing it with the Gestalt Config library [3]. This library, licensed under Apache-2.0, is a mature, well-established solution for configuration handling.

Switching to Gestalt Config would allow us to move towards a more structured and modular configuration in Pulsar. Our current configuration is not modular, as it relies on a "god object" that collects all possible configuration options. Gestalt Config offers modular usage patterns similar to those of Spring Boot's external configuration [4] and MicroProfile Config [5] in Quarkus. However, Gestalt Config does not pull in other dependencies, giving it an advantage over the Spring Boot and Quarkus configuration solutions. Other libraries in this category include the Typesafe Config library [6] from Lightbend with HOCON [7], commonly used in Scala and Akka-based applications.

Gestalt Config supports many configuration file formats, including flat properties files, YAML, JSON, TOML, and even HOCON. It also offers security features for reading secrets directly from Vault, AWS Secrets Manager, and GCP Secret Manager, without the need to use the file system or environment variables to inject secrets into the application configuration. This could significantly improve Pulsar's security posture.
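
To make the "apply overrides in memory" idea concrete, here is a minimal sketch that uses only the JDK. It is not Pulsar's loader and not the Gestalt Config API. The class name, method names, and the `PULSAR_PREFIX_` environment prefix are assumptions chosen for illustration. The point is only that the on-disk file is read and never rewritten, so a read-only root file system is not a problem.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch: neither Pulsar's configuration loader nor Gestalt Config.
final class InMemoryConfigOverrides {

    // Assumed prefix for this sketch; it is stripped to obtain the configuration key.
    static final String ENV_PREFIX = "PULSAR_PREFIX_";

    /** Parses properties-format text (e.g. the content of conf/broker.conf) without writing anything. */
    static Properties parse(String confText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(confText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns a copy of base with prefixed environment entries layered on top; base is not modified. */
    static Properties withOverrides(Properties base, Map<String, String> env) {
        Properties merged = new Properties();
        merged.putAll(base);
        for (Map.Entry<String, String> entry : env.entrySet()) {
            String name = entry.getKey();
            if (name.startsWith(ENV_PREFIX)) {
                merged.setProperty(name.substring(ENV_PREFIX.length()), entry.getValue());
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Inline text stands in for a file read from the (possibly read-only) file system.
        Properties onDisk = parse("brokerServicePort=6650\nclusterName=standalone\n");
        Properties effective = withOverrides(onDisk, System.getenv());
        effective.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

In this sketch, starting the process with `PULSAR_PREFIX_clusterName=prod` in the environment would change only the in-memory `clusterName` value. A configuration library would provide the same layering along with type conversion and validation, which this sketch leaves out.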

Pulsar's current "homegrown configuration framework" is quite simple: it is implemented in a few classes, with the main logic in the PulsarConfigurationLoader [8] and FieldParser [9] classes, called from the PulsarBrokerStarter class [10].

The main question is: should we continue extending Pulsar's homegrown configuration framework, or should we consider adopting a library like Gestalt Config to support future improvements in modularity, structured configuration, and security?

Best regards,
Lari

References:
1 - https://github.com/apache/pulsar-helm-chart/blob/29ea17b3fceef65160620b9018d0dd0449a168c5/charts/pulsar/templates/broker-statefulset.yaml#L210-L221
2 - https://github.com/apache/pulsar/issues/22088
3 - https://github.com/gestalt-config/gestalt
4 - https://docs.spring.io/spring-boot/docs/current/reference/html/features.html#features.external-config
5 - https://microprofile.io/specifications/microprofile-config/
6 - https://github.com/lightbend/config
7 - https://github.com/lightbend/config/blob/main/HOCON.md
8 - https://github.com/apache/pulsar/blob/master/pulsar-broker-common/src/main/java/org/apache/pulsar/common/configuration/PulsarConfigurationLoader.java
9 - https://github.com/apache/pulsar/blob/master/pulsar-common/src/main/java/org/apache/pulsar/common/util/FieldParser.java
10 - https://github.com/apache/pulsar/blob/db79096baaa3d7118aa026978a615ddc576f9183/pulsar-broker/src/main/java/org/apache/pulsar/PulsarBrokerStarter.java#L69-L76