And check out HADOOP-12145 <https://issues.apache.org/jira/browse/HADOOP-12145> Organize and update CodeReviewChecklist wiki.
Thanks, your contribution will be greatly appreciated! On Mon, Aug 7, 2017 at 5:53 AM, Steve Loughran <ste...@hortonworks.com> wrote: > > Hi Lars & Welcome! > > Maybe the first step here would be look at those style guides and think > how to bring them up to date, especially with stuff like lambda-expressions > in java 8, and mnodules forthcoming in in java 9, SLF4J logging, Junit 5 -> > 5 testing, code instrumentation, diagnostics, log stability, etc. > > https://issues.apache.org/jira/browse/HADOOP-12143 . ; > > This is my go at doing this > > https://github.com/steveloughran/formality/blob/ > master/styleguide/styleguide.md > > > I've not done any work on trying to get it in, more evolving it as how I > code & what I look for, especially in tests. > > If you want to take this on, it'd be nice. At the same time, I fear > there'd be push back if you turned up and started telling people what to > do. Collaborating with us all on the test code is a good place to start. > > We're also more relaxed about contributions to the less-core bits of the > system (things like HDFS, IPC, security and Yarn core are trouble). If > there's stuff outside that you want to take a go at helping clean up, > that'd be lower risk (example: object store connectors) > > -Steve > > > > On 7 Aug 2017, at 13:13, Lars Francke <lars.fran...@gmail.com<mailto: > lars.fran...@gmail.com>> wrote: > > Hi, > > a few words about me: I've contributed to Hadoop (and it's ecosystem[4]) in > the past am a Hive committer and have used Hadoop for 10 years now, so I'm > not totally inexperienced. I'm earning my money as a Hadoop consultant so > I've seen dozens of real-life clusters in my life. > > As part of a few recent client projects and now writing about Hadoop in a > new project/book I'm digging into the source code to figure out some of the > things that are not documented. > > But as part of this digging I'm seeing lots of warnings in the code, > inconsistencies etc. and I'd like to contribute some fixes to this back to > the community. > > I have been a long-time believer in good code quality and consistent code > styles. This might affect people like me especially who do a lot of > "drive-by" contributions as I'm not someone who looks at the code daily but > comes across it reasonably often as part of client engagements. In those > scenarios, it's very unhelpful to have inconsistent code & bad > documentation. > > Two simple but concrete examples: > * There's lots of "final" usages on variables and methods but no > consistency. Was this done for particular reasons or personal preference? > > personal, though with a move to l-expressions, it matters a lot more. We > should really be marking all parameters as final at the very least. > > > * Similarly, there's lots of things that are public or protected while they > could in theory be private. This especially makes it very hard to reason > about code. > > there's now a bit of fear of breaking things, but at the very least, > things could be protected or package-private more than they are. > > > > Judging from the current code there's lots of "unofficial" code styling > and/or personal preference. The Wiki says[1] to follow the Sun > guidelines[2] which have not been updated in almost 20 years. A new version > is in the works an clarifies a lot of things[3]. I'm trying to get it > published soon. I'd try to format according to the latter (that means among > other things no "final" for local variables). > > I realize that I won't be able to single-handedly fix all of this > especially as code gets contributed but if the community thinks it's > worthwhile I'd still love to land a few cleanup patches. My experience in > the past has been that it's hard to get attention to these things (which I > fully understand as they take up someone's time to review & commit). > > So, this is my request for comments on these questions: > * Is there any interest in this at all? > ** "This" being patches for code style & things like FindBugs & Checkstyle > warnings > * Size of the patches: Rather one big patch or smaller ones (e.g. per file > or package) > * Anyone willing to help me with this? e.g. reviewing and committing? I'd > be more than happy to bribe you with drinks, sweets, food or something else > > My plan is not to go through each and every file and fix every issue I see. > But there are some specific areas I'm looking at in detail and there I'd > love to contribute back. > > Thank you for reading! > > Cheers, > Lars > > PS: Posting to common-dev only, not sure if I should cross post to hdfs-dev > and yarn-dev as well? > > [1] <https://wiki.apache.org/hadoop/CodeReviewChecklist> > [2] < > http://www.oracle.com/technetwork/java/javase/documentation/codeconvtoc- > 136057.html > > [3] <http://cr.openjdk.java.net/~alundblad/styleguide/index-v6.html> > [4] < > https://issues.apache.org/jira/issues/?filter=-1&jql= > reporter%20%3D%20lars_francke%20OR%20assignee%20%3D%20lars_francke > > > -- John