+1 to remove Perfkit if we can cover what we need without it. One less tool to 'learn/understand/maintain' is always good.

On Fri, Jun 28, 2019 at 5:31 PM Lukasz Cwik <lc...@google.com> wrote:
>
> +1 for removing tests that are not maintained.
>
> Are there features in Perfkit that we would like to be using that we aren't?
> Can we make the integration with Perfkit less brittle?
>
> If we aren't getting much and don't plan to get much value in the short term,
> removal makes sense to me.
>
> On Thu, Jun 27, 2019 at 3:16 AM Łukasz Gajowy <lgaj...@apache.org> wrote:
>>
>> Hi all,
>>
>> moving the discussion to the dev list:
>> https://github.com/apache/beam/pull/8919. I think that Perfkit Benchmarker
>> should be removed from all our tests.
>>
>> Problems that we face currently:
>>
>> - Changes to Gradle tasks/build configuration in the Beam codebase have to be
>>   reflected in Perfkit code. This required PRs to Perfkit which can last and
>>   the tests break due to this sometimes (no change in Perfkit + change already
>>   there in beam = incompatibility). This is what happened in PR 8919 (above),
>> - Can't run in Python3 (depends on python 2 only library like functools32),
>> - Black box testing which hard to collect pipeline related metrics,
>> - Measurement of run time is inaccurate,
>> - It offers relatively small elasticity in comparison with eg. Jenkins tasks
>>   in terms of setting up the testing infrastructure (runners, databases). For
>>   example, if we'd like to setup Flink runner, and reuse it in consequent
>>   tests in one go, that would be impossible. We can easily do this in Jenkins.
>>
>> Tests that use Perfkit:
>>
>> 1. IO integration tests,
>> 2. Python performance tests,
>> 3. beam_PerformanceTests_Dataflow (disabled),
>> 4. beam_PerformanceTests_Spark (failing constantly - looks not maintained).
>>
>> From the IOIT perspective (1), only the code that setups/tears down
>> Kubernetes resources is useful right now but these parts can be easily
>> implemented in Jenkins/Gradle code. That would make Perfkit obsolete in IOIT
>> because we already collect metrics using Metrics API and store them in
>> BigQuery directly.
>>
>> As for point 2: I have no knowledge of how complex the task would be (help
>> needed).
>>
>> Regarding 3, 4: Those tests seem to be not maintained - should we remove
>> them?
>>
>> Opinions?
>>
>> Thank you,
>> Łukasz