[ https://issues.apache.org/jira/browse/BEAM-3060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16217762#comment-16217762 ]
Chamikara Jayalath commented on BEAM-3060:
------------------------------------------

Thanks for the proposal. Added some comments and assigned the JIRA to you.

> Add performance tests for commonly used file-based I/O PTransforms
> ------------------------------------------------------------------
>
>                 Key: BEAM-3060
>                 URL: https://issues.apache.org/jira/browse/BEAM-3060
>             Project: Beam
>          Issue Type: Test
>          Components: sdk-java-core
>            Reporter: Chamikara Jayalath
>            Assignee: Szymon Nieradka
>
> We recently added a performance testing framework [1] that can be used to do
> the following:
> (1) Execute Beam tests using PerfkitBenchmarker.
> (2) Manage Kubernetes-based deployments of data stores.
> (3) Easily publish benchmark results.
>
> I think it will be useful to add performance tests for commonly used
> file-based I/O PTransforms using this framework. I suggest looking into the
> following formats initially:
> (1) AvroIO
> (2) TextIO
> (3) Compressed text using TextIO
> (4) TFRecordIO
>
> It should be possible to run these tests easily for various Beam runners
> (Direct, Dataflow, Flink, Spark, etc.) and file systems (GCS, local, HDFS,
> etc.).
>
> In the initial version, tests can be made manually triggerable for PRs
> through Jenkins. Later, we could make some of these tests run periodically
> and publish benchmark results (to BigQuery) through PerfkitBenchmarker.
>
> [1] https://beam.apache.org/documentation/io/testing/