[ https://issues.apache.org/jira/browse/BEAM-9901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17102041#comment-17102041 ]
Ismaël Mejía commented on BEAM-9901:
------------------------------------

Thanks. It would be great to get an intern to advance this work, even if it does not end up fully complete; count on me for extra advice and reviews if needed.

I suggest ignoring the generator part and focusing on writing and executing the queries. Java's generator can be used at first to export data into both Pubsub and GCS. Running the queries on that generated data would also give us a more standard baseline for comparing the performance of similar queries between portable and non-portable runners.

> Beam python nexmark benchmark suite
> -----------------------------------
>
>                 Key: BEAM-9901
>                 URL: https://issues.apache.org/jira/browse/BEAM-9901
>             Project: Beam
>          Issue Type: Task
>          Components: benchmarking-py, testing-nexmark
>            Reporter: Yichi Zhang
>            Priority: Major
>             Fix For: Not applicable
>
>
> Nexmark is a suite of queries (pipelines) used to measure performance and
> non-regression in Beam. Currently it exists in the Java SDK:
> [https://github.com/apache/beam/tree/master/sdks/java/testing/nexmark/src/main/java/org/apache/beam/sdk/nexmark]
> In this project we would like to create a Nexmark benchmark suite in the Python
> SDK equivalent to what Beam has for Java. This allows us to determine the
> performance impact of pull requests on Python pipelines.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)