Hi everyone, I've summarized the discussion so far into a proposal doc <https://docs.google.com/document/d/1YbV18mrHujmiLBtadS1WzCVeiI3Lo7W6awWJDA4A98o>, so that it is easier to comment on.
Please add your comments. Link explicitly: https://docs.google.com/document/d/1YbV18mrHujmiLBtadS1WzCVeiI3Lo7W6awWJDA4A98o

Regards,
--Mikhail

On Tue, Jan 8, 2019 at 9:50 AM Michael Luckey <adude3...@gmail.com> wrote:

Currently I'd opt for just enabling jacocoTestReport on javaPreCommit.

Although this will disable the build cache, it seems we could consider the cache to be effectively disabled anyway. E.g. looking into https://scans.gradle.com/s/eglskrakojhrm/performance/buildCache we see a local cache miss rate of 98%; other scans I looked into showed similar numbers.

Disabling parallel builds shouldn't be necessary.

On Tue, Jan 8, 2019 at 6:23 PM Maximilian Michels <m...@apache.org> wrote:

> I don't think it would be unreasonable to disable parallel build for code coverage if necessary, perhaps in a separate Jenkins job if it significantly increases job time.

+1 A separate coverage job would be helpful. It would be fine if it ran longer than the regular PreCommit.

On 08.01.19 10:41, Scott Wegner wrote:

I started a PR to see what it would take to upgrade to Gradle 5.0: https://github.com/apache/beam/pull/7402

It seems the main blocker is gogradle plugin compatibility for the Go SDK. The gogradle project is actively working on compatibility, so perhaps we could check back in a month or so: https://github.com/gogradle/gogradle/pull/270

I can see from the Gradle build scan that our Java pre-commit is using parallel build but not the build cache: https://scans.gradle.com/s/eglskrakojhrm/switches. I don't think it would be unreasonable to disable parallel build for code coverage if necessary, perhaps in a separate Jenkins job if it significantly increases job time.

On Mon, Jan 7, 2019 at 9:56 AM Kenneth Knowles <k...@apache.org> wrote:

I am pretty sure the build cache is not in use on Jenkins. It would have to live somewhere persisted across runs, which I don't think is the default. And we'd be seeing much lower build times.

Kenn

On Mon, Jan 7, 2019 at 9:42 AM Mikhail Gryzykhin <mig...@google.com> wrote:

@Jason
I believe that we want to get a coverage report as a PR pre-commit check, so that we can utilize it as part of code review. Removing parallelization increases pre-commit duration greatly (see the big drop on this graph <http://104.154.241.245/d/_TNndF2iz/pre-commit-test-latency?orgId=1&from=now-6M&to=now>), and I believe that's what Michael worries about.

If we want to run coverage jobs in the background on master, then a separate job is possible.

--Mikhail

On Mon, Jan 7, 2019 at 9:33 AM Jason Kuster <jasonkus...@google.com> wrote:

Michael, could we solve that problem by adding an additional build with those features disabled, specifically for generating coverage data over time? Any tradeoffs to doing something like that?

On Mon, Jan 7, 2019 at 8:23 AM Michael Luckey <adude3...@gmail.com> wrote:

Unfortunately I am a bit worried that enabling code coverage will not come for free at the moment. As far as I understand the current state of the build system, we would either have to disable parallel builds or the build cache, both of which I assume to be enabled currently on precommit runs. I am trying to get some numbers here, but it might well be that someone has already done that evaluation, so I'd be happy to get further insights if someone with deeper knowledge of the setup could jump in here.

This situation will probably be resolved after an upgrade to Gradle 5, but unfortunately that does not seem to be easily doable right now.

michel
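For concreteness, a minimal per-module sketch of the JaCoCo wiring being discussed, in Gradle's Groovy DSL; the task wiring and report settings below are illustrative assumptions rather than Beam's actual build configuration:

    // Illustrative sketch only, assuming the standard 'java' and 'jacoco' Gradle plugins.
    apply plugin: 'java'
    apply plugin: 'jacoco'

    test {
        // Produce a coverage report whenever the unit tests run (e.g. as part of a precommit task).
        finalizedBy jacocoTestReport
    }

    jacocoTestReport {
        dependsOn test
        reports {
            xml.enabled true    // machine-readable output for external tools (SonarQube, codecov, ...)
            html.enabled true   // human-readable report for local inspection
        }
    }

Run locally with something like ./gradlew test jacocoTestReport, which also gives contributors a way to evaluate coverage on their own machines.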
On Mon, Jan 7, 2019 at 4:21 PM Maximilian Michels <m...@apache.org> wrote:

Code coverage would be very useful, regardless of what tool we use to measure it. Especially when reviewing PRs, because it provides a quantitative measure which can help reviewer and contributor decide whether testing is sufficient.

I'd prefer to use specific tools instead of a one-size-fits-all tool, but I'm open to whatever works best. The process of setting the tooling up is likely going to be iterative.

@Mikhail If you want to set up anything and present it here, that would be awesome. (Of course discussion can continue here.)

On 03.01.19 19:34, Heejong Lee wrote:

I think we should also consider the false positive ratio of the tool. Oftentimes deeper analysis easily produces tons of false positives, which makes people less interested in static analysis results because of the triaging overhead.

On Thu, Jan 3, 2019 at 4:18 PM Scott Wegner <sc...@apache.org> wrote:

Discussion on software engineering practices and tools tends to gather many opinions :) I suggest breaking this out into a doc to keep the discussion organized.

I appreciate that you've started with a list of requirements. I would add a few:

6. Analysis results should be integrated into the code review workflow.
7. It should also be possible to run analysis and evaluate results locally.
8. Analysis rules and thresholds should be easily configurable.

And some thoughts on the previous requirements:

> 2. Tool should keep history of reports.

Seems nice-to-have but not required. I believe the most value is in viewing the delta during code review, and maybe also a snapshot of the overall state of master. If we want trends we could also import data into s.apache.org/beam-community-metrics.

> 4. Tool should incorporate code coverage and static analysis reports. (Or more if applicable)

Is the idea to have a single tool responsible for all code analysis? We currently have a variety of tools running in our build. It would be challenging to find a single tool that aggregates all current (and future) analysis, especially considering the different language ecosystems. Having targeted tools responsible for different pieces allows us to pick and choose what works best for Beam.
On Thu, Jan 3, 2019 at 3:43 PM Mikhail Gryzykhin <mig...@google.com> wrote:

Let me summarize and answer the main questions that I see:
1. It seems that we do want to have some statistics on coverage and to integrate automatic requirements into our build system.
2. Implementation is still to be discussed.

Let's talk about implementation further.

My requirements for the choice are:
1. Tool should give us an option to deep-dive into findings.
2. Tool should keep history of reports.
3. Tool should give an option to break the build (allow for hardcoded requirements).
4. Tool should incorporate code coverage and static analysis reports. (Or more if applicable)
5. Tool should support most or all languages we utilize in Beam.

Let me dive into SonarQube a bit first. (All per my understanding of how it works.) It hits most of the points, potentially with some tweaks. The tool relies on reports generated by common tools, tracks the history of builds and allows navigating it, and is multi-language. I'm still working on figuring out how to configure it, though.

Common thresholds/checks suggested by SonarQube:
Many checks can be applied to new code only. This allows us not to fix legacy code, but to keep all new additions clean and neat(ish).
Test coverage by line/branch: Relies on a Cobertura report. Coverage by branch is usually suggested (all "if" lines should be tested with both positive and negative condition results).
Method complexity: The number of different paths/conditions a method can be invoked with. The suggested maximum is 15. Generally describes how easy a method is to test/understand.
Bugs/vulnerabilities: Generally, the output of FindBugs. Reflects commonly vulnerable/dangerous code that might cause errors, or just errors in code. I believe Sonar allows for custom code analysis as well, but that is not required.
Technical debt: Estimates of how much time it will take to clean up the code to make it shiny. Includes code duplication, commented-out code, not following naming conventions, long methods, ifs that can be inverted, public methods that can be private, etc. I'm not familiar with the explicit list, but in my experience the suggestions are usually relevant.
More on metrics can be found here: https://docs.sonarqube.org/latest/user-guide/metric-definitions/

Suggested alternatives:

https://scan.coverity.com/
This tool looks great and I'll check more on it. But it is restricted to 14 or 7 builds per week (not sure how they will classify our project). Also, I'm not sure whether we can break a pre-commit based on a report from Coverity. It looks good for generating historical data.

https://docs.codecov.io/docs/browser-extension
I'll check more on this one. It looks great to have it integrated in PRs, although it requires each developer to install a browser plugin. I don't think it can break builds, and it only does coverage. Am I correct?

Regards,
--Mikhail
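As a rough illustration of the Gradle-side setup SonarQube would need, a minimal sketch follows; the plugin version, server URL, project key, and report path are placeholders/assumptions, not an agreed configuration:

    // Illustrative sketch only: SonarQube Gradle plugin with placeholder values.
    plugins {
        id 'org.sonarqube' version '2.7'   // plugin version is an assumption
    }

    sonarqube {
        properties {
            property 'sonar.host.url', 'https://builds.apache.org/analysis'   // placeholder server URL
            property 'sonar.projectKey', 'apache_beam'                         // placeholder project key
            // Point the scanner at JaCoCo's XML output so coverage shows up alongside static analysis.
            property 'sonar.coverage.jacoco.xmlReportPaths', "${buildDir}/reports/jacoco/test/jacocoTestReport.xml"
        }
    }

Analysis would then be triggered with something like ./gradlew test jacocoTestReport sonarqube, either from Jenkins or locally.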
On Thu, Jan 3, 2019 at 2:18 PM Kenneth Knowles <k...@apache.org> wrote:

It would be very useful to have line and/or branch coverage visible. These are both very weak proxies for quality or reliability, so IMO strict thresholds are not helpful. One thing that is super useful is to integrate line coverage into code review, like this: https://docs.codecov.io/docs/browser-extension. It is very easy to notice major missing tests.

We have never really used SonarQube. It was turned on as a possibility in the early days but never worked on past that point. Could be nice. I suspect there's a lot to be gained by just finding very low numbers and improving them. So just running JaCoCo's offline HTML generation would do it (this also integrates with Jenkins). I tried this the other day and discovered that our Gradle config is broken and does not wire tests and coverage reporting together properly. Last thing: how is "technical debt" measured? I'm skeptical of quantitative measures for qualitative notions.

Kenn

On Thu, Jan 3, 2019 at 1:58 PM Heejong Lee <heej...@google.com> wrote:

I don't have any experience of using SonarQube, but Coverity worked well for me. Looks like it already has a Beam repo: https://scan.coverity.com/projects/11881

On Thu, Jan 3, 2019 at 1:27 PM Reuven Lax <re...@google.com> wrote:

checkstyle and findbugs are already run as precommit checks, are they not?
On Thu, Jan 3, 2019 at 7:19 PM Mikhail Gryzykhin <mig...@google.com> wrote:

Hi everyone,

In our current builds we (can) run multiple code quality tools such as checkstyle, findbugs, and test coverage via Cobertura. However, we do not utilize many of those signals.

I suggest adding requirements on code based on those tools. Specifically, I suggest adding pre-commit checks that will require PRs to conform to some quality checks.

We can see a good example of thresholds in the default quality gate config provided by Apache's SonarQube instance <https://builds.apache.org/analysis/quality_gates/show/1>:
80% test coverage on new code,
5% technical debt on new code,
no bugs/vulnerabilities added.

As another part of this proposal, I want to suggest the use of SonarQube for tracking code statistics and as the agent for enforcing code quality thresholds. It is an Apache-provided tool that integrates with Jenkins and Gradle via plugins.

I believe some reporting to SonarQube was configured for mvn builds of some Beam sub-projects, but it was lost during the migration to Gradle.

I was looking for other options, but so far I have found only generic configs for Gradle builds that fail the build if code coverage for a project is too low. Such an approach would force us to backfill tests for all existing code, which can be tedious and demands learning all the legacy code that might not be part of current work.

I suggest we discuss and come to a conclusion on two points in this thread:
1. Do we want to add code quality checks to our pre-commit jobs and require them to pass before a PR is merged?
   Suggested: Add the code quality checks listed above at first, and adjust them as we see fit in the future.
2. What tools do we want to utilize for analyzing code quality?
   Under discussion. Suggested: SonarQube, but it will depend on the level of functionality we want to achieve.

Regards,
--Mikhail
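For reference, the kind of generic Gradle config mentioned above (failing the build when coverage drops below a bar) looks roughly like this; the counter and the 80% threshold are illustrative, echoing the quality-gate numbers rather than proposing a value:

    // Illustrative sketch only: a hard coverage threshold enforced per module by the JaCoCo plugin.
    jacocoTestCoverageVerification {
        violationRules {
            rule {
                limit {
                    counter = 'BRANCH'        // branch coverage, as suggested earlier in the thread
                    value   = 'COVEREDRATIO'
                    minimum = 0.80            // illustrative threshold mirroring the 80% quality gate
                }
            }
        }
    }
    // Hook the verification into the standard 'check' lifecycle so it can fail a precommit.
    check.dependsOn jacocoTestCoverageVerification

Note that, unlike SonarQube's "on new code" quality gates, such a rule applies to all code in a module, which is exactly the legacy-backfill concern raised above.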