Hello All,
I have run into a dilemma and would like to know what our policy is for the following situation. As part of the implementation of the Kudu Input operator (https://issues.apache.org/jira/browse/APEXMALHAR-2472), I will be using Antlr4 as the parser tool to parse a single-line string as an SQL-like statement that represents the set of tuples to be streamed out of the Kudu store to the downstream operators. I will post the design in a separate mailing thread once a few things are finalised; this mail is about Checkstyle and code auto-generated by tools like Antlr4.

The design involves writing a grammar file and letting Maven generate the parser and related code as .java files as part of the build process. Only the grammar ".g4" file is maintained in the repository as the Kudu functionality evolves. However, this means that Checkstyle fails for the autogenerated classes. Following are the three options that I think we have, and I would like to get thoughts on the best way forward.

Option 1: We let the code generator write to the "target/generated-sources" path. This is the default for the Maven Antlr4 plugin. This does not pass the Checkstyle Maven plugin, because the plugin checks auto-generated code as well. The fix is to modify the Checkstyle plugin configuration to only look at the "src/" folder paths as opposed to the "compiled sources". This works from a build perspective, but the drawback is that IDEs will not include "target/generated-sources" for class resolution. IDEs do have plugins to resolve this, but the developer community might find it irksome.

Option 2: We let the Antlr4 code generator write its output into the Kudu package path; of course Checkstyle would fail this as well. The fix is to add an "excludes" pattern to the Checkstyle configuration so that it ignores all Java files matching the pattern of files generated by the Antlr4 code-gen tool. Even if the community approves this approach, one issue remains: the tool generates a couple of ".tokens" files that are always placed in the root of the class path and not under the package structure, which clutters the source tree a bit. I am still working on this, as it needs to be resolved.

Option 3: Perhaps the ideal is to have a separate module for Kudu at the top level, which would resolve all of these issues (i.e. token files are generated in the Kudu module root along with the Java sources in the correct package structure). I guess that is a separate discussion that Thomas/Vlad and others are planning to take up in a separate thread on the mailing list.

Could you please let me know what you think is the ideal path to pursue (or if there are other alternatives for the use case above)?

Regards,
Ananth
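
PS: For reference, here is a rough, untested sketch of what the Checkstyle side of options 1 and 2 might look like in the pom. The sourceDirectories and excludes parameters are existing maven-checkstyle-plugin parameters, but the exclude pattern and the package path in it are hypothetical placeholders, and the exact behaviour may depend on the plugin version used in our build.

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-checkstyle-plugin</artifactId>
  <configuration>
    <!-- Option 1: check only the hand-written sources under src/,
         instead of every compile source root (which would include
         target/generated-sources once the Antlr4 plugin registers it). -->
    <sourceDirectories>
      <sourceDirectory>${project.build.sourceDirectory}</sourceDirectory>
    </sourceDirectories>

    <!-- Option 2 (alternative): keep the default source roots but skip
         the generated parser classes. The package path below is a
         hypothetical placeholder, not the real Kudu package. -->
    <!--
    <excludes>**/kudu/generated/**/*</excludes>
    -->
  </configuration>
</plugin>
```

Only one of the two settings would be needed, depending on which option we choose.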