Hello!

Because we are switching to proto3 as the language for the GAR format definition, we need to decide whether we are going to store the generated code in git or not.

Pros of storing generated code:
1. Stability: even if protoc changes or a plugin is deprecated, we still have generated, compilable code in the repo.
2. Usability: anyone can go to git and see what the actual code looks like. Users and developers also do not need to care about protoc/buf; they can just clone the repo and that's it.
3. CI simplicity: we do not need to incorporate protoc/buf into the build process.

Cons of storing generated code:
1. Huge git diffs: in my experience, changing a single line in a proto file may lead to a diff of hundreds of lines in the generated classes.
2. Code generated by protoc is practically unreadable, so it does not help much in understanding what is going on.
3. Risk of outdated classes: I cannot think of an obvious way to check that the generated code is up to date.

Sources of possible inspiration:
1. https://github.com/apache/spark/blob/master/dev/connect-check-protos.py : a utility in the Apache Spark project that checks whether the generated code is up to date. We may try to implement the same for Java/C++ too.
2. https://github.com/apache/spark/blob/master/dev/connect-gen-protos.sh : a utility in the Apache Spark project that regenerates the proto classes for PySpark and applies formatting to reduce the git diff. We may try to implement the same for Java/C++ too.

How it is done in Apache Spark itself:
1. Proto files are incorporated into the Maven build via maven-proto-plugin, so Java classes are not stored in the repo and are generated during the build.
2. Python classes are stored in the repo and are generated/updated on request. The sync status is checked in CI.

Other options:
I talked with some engineers, and as I understood it, the best solution and the industry standard is to put all the protos into a separate repository that generates the classes and publishes them as packages. After that, these packages may be used as dependencies.
The problem here is that this requires splitting our monorepo into parts, which makes it harder to work with, harder to onboard people, harder to test, etc.

Best regards,
Sem
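
P.S. On the "risk of outdated classes" point: a rough sketch of how a sync check could work, in the spirit of the Spark connect-check-protos.py script linked above, but not copied from it. The idea is to regenerate the code into a temporary directory and compare it with what is committed. All names here (find_stale_files, check_up_to_date, the generate hook) are hypothetical and only for illustration; the generate hook stands in for whatever protoc/buf invocation we end up using.

```python
import filecmp
import tempfile
from pathlib import Path
from typing import Callable, List, Set


def relative_files(root: Path) -> Set[Path]:
    """Return all file paths under root, relative to root."""
    return {p.relative_to(root) for p in root.rglob("*") if p.is_file()}


def find_stale_files(committed: Path, fresh: Path) -> List[str]:
    """List files that are missing, extra, or changed in the committed tree."""
    committed_files = relative_files(committed)
    fresh_files = relative_files(fresh)
    # Generated now, but not present in the repo.
    stale = [f"missing: {p}" for p in sorted(fresh_files - committed_files)]
    # Present in the repo, but no longer generated.
    stale += [f"extra: {p}" for p in sorted(committed_files - fresh_files)]
    for p in sorted(committed_files & fresh_files):
        # shallow=False compares file contents, not only size and mtime.
        if not filecmp.cmp(committed / p, fresh / p, shallow=False):
            stale.append(f"changed: {p}")
    return stale


def check_up_to_date(generate: Callable[[Path], None], committed: Path) -> List[str]:
    """Regenerate into a temporary directory and compare with the committed tree."""
    with tempfile.TemporaryDirectory() as tmp:
        fresh = Path(tmp)
        # Hypothetical hook: assumed to run protoc/buf with `fresh` as output dir.
        generate(fresh)
        return find_stale_files(committed, fresh)
```

In CI, a non-empty result would fail the job and print the list, so the author knows to regenerate and commit. This only works if the generation is deterministic, i.e. the protoc/plugin versions and the formatter are pinned.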
