Hi everyone, thanks for starting this discussion Niels.
I like the idea of getting rid of parsing a Properties instance. On the other hand, I also understand that people are concerned about disrupting the workflows of our devs. Maybe we can find a compromise between both approaches. For example, one could define an EnvironmentInformationProvider interface which is loaded at runtime. The implementation of this interface could be generated by `mvn generate-sources`. Moreover, if we cannot find it, then we fail with a descriptive exception saying to run `mvn generate-sources`. That way only devs who rely on the EnvironmentInformation have to run `mvn generate-sources` when testing/developing and we could still get rid of all the parsing and logic of a properties file. What do you think? Cheers, Till On Mon, Apr 13, 2020 at 10:53 AM Niels Basjes <ni...@basjes.nl> wrote: > Hi, > > On Mon, Apr 13, 2020 at 8:30 AM Yang Wang <danrtsey...@gmail.com> wrote: > > > Although the latter option is more stable, > > i think it is not acceptable for all the developers to execute `mvn > > generate-sources` first. > > Otherwise, the Flink project is just broken and could not run tests, Flink > > jobs in IDE. > > > > > It is important to realize that this discussion is essentially around the > fact that systems like IntelliJ (what I use) execute various ways of > generating code that have been written in a maven pom.xml in different > ways. Simply put: They don't execute all instructions that have been > defined in the pom.xml automatically when needed. > > - With the properties file variant (which uses resource filtering) > IntelliJ does resource filtering automatically yet no maven plugins are > run that affect the properties that are replaced. > So the git variables are NOT populated when running in IntelliJ and I > have seen several values in the propersites file are nonsense. > As a consequence the Java code reading those needs to catch things like > "missing properties file" and various kinds of "nonsense values" and > replace them with a workable default. > So when running this code you are actually running something different > from what will be run when doing a `mvn clean install`, yet the > developer > is under the illusion it all works because of the code that catches all > of > those problems. > Depending on the way these variables are used this may lead to "It fails > in production but it works fine in IntelliJ" situations. > In my mind the code that catches all of those exceptional situations in > poorly generated properties only exists so that developers can run the > (otherwise perfectly fine) software in a "broken" build/development > environment. > > > - With the way I propose to generate the code the effect is indeed > slightly harder: If you do not run the pom.xml based code generation it > will not work. > I understand that this requires the developers to think a bit more about > the code they are working on. > The upside is that the code either works perfectly or does not compile. > There is no "it compiles but is really nonsense". > I prefer this. > Also at this point in time this is already true for Flink: There are > quite a few modules where the developer MUST run mvn generate-sources > for > all tests to run successfully. > Like I indicated there is a javacc SQL parser and there are a lot of > tests that rely on generating Avro code before they are able to run. > I have several projects where systems like Avro and Antlr force me in > this direction, there is simply no other way to do this. > > So i think the version properties file is enough for now. +1 for the first > > option. > > > > Like I said. I'm fine with either choice by the committers. > I would choose the `mvn generate-sources` code variant as it is much > simpler and much more reliable. > > Niels > > Niels Basjes <ni...@basjes.nl> 于2020年4月9日周四 下午4:47写道: > > > > > Hi, > > > > > > I'm working on https://issues.apache.org/jira/browse/FLINK-16871 to > make > > > more build time variables (like the scala version) into the code > > available > > > at runtime. > > > > > > During the review process there was discussion around a basic question: > > *Is > > > generating java code during the build ok?* > > > See > > > > > > - https://github.com/apache/flink/pull/11245#discussion_r400035133 > > > - https://github.com/apache/flink/pull/11592 > > > - https://github.com/apache/flink/pull/11592#issuecomment-610282450 > > > > > > As suggested by Chesnay Schepler I'm putting this question to the > mailing > > > list. > > > https://github.com/apache/flink/pull/11592#issuecomment-610963947 > > > > > > The main discussion was around the ease of use when running in an IDE > > like > > > IntelliJ. > > > > > > So essentially we have two solution directions available: > > > > > > 1. *Generate a properties file and then use the classloader to load > > this > > > file as a resource and then parse it as a property file.* > > > This is the currently used solution direction for this part of the > > code. > > > A rough version of this (to be improved) : > > > > > > > > > https://github.com/apache/flink/commit/47099f663b7644056e9d87b262cd4dba034f513e > > > This method has several effects: > > > 1. The developer can run the project immediately from within the > > IDE > > > as fallback values are provided if the 'actual' values are > missing. > > > 2. This property file (with stuff that should never be > overwritten) > > > can be modified by placing a different one in the classpath. In > > > fact it IS > > > modified in the flink-dist as it generates a new file with the > same > > > name > > > into the binary distribution (I consider this to be bad). > > > 3. Loading resources means loading, parsing and a lot of error > > > handling. Lots of things "can be null" or be a default value. So > > the > > > values are unreliable and lots of code needs to handle this. In > > fact > > > when > > > running from IntelliJ the properties file is generated poorly > most > > > of the > > > time, only during a normal maven build will it work correctly. > > > 2. *Generate a Java source file and then simply compile this and > make > > it > > > part of the project.* > > > Essentially the same model as you would have when using Apache Avro, > > > Protobuf, Antlr 4 and javacc (several of those are used in Flink!). > > > A rough version of this (to be improved) : > > > > > > > > > https://github.com/apache/flink/commit/d215e4df60dc9d647dcee1aa9a2114cbf49d0566 > > > This method has several effects: > > > 1. The developer MUST run 'mvn generate-sources' before the actual > the > > > project immediately from within the IDE as fallback values are > > > provided if > > > the 'actual' values are missing. > > > 2. The code/test will not run until this step is done. > > > 3. Because the file is generated by a plugin it is always > correct. > > As > > > a consequence all variables are always available and the > downstream > > > users > > > no longer have to handle the "can be null" or "default value" > > > situations. > > > > > > So is generating code similar to what I created a desired change? > > > My opinion is that it is the better solution, the data available is > more > > > reliable and as a consequence the rest of the code is simpler. > > > It would probably mean that during development of flink you should be > > aware > > > of this and do an 'extra step' to get it running. > > > > > > What do you guys think? > > > > > > -- > > > Best regards / Met vriendelijke groeten, > > > > > > Niels Basjes > > > > > > > > -- > Best regards / Met vriendelijke groeten, > > Niels Basjes >