[ https://issues.apache.org/jira/browse/BEAM-6627?focusedWorklogId=201689&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-201689 ]
ASF GitHub Bot logged work on BEAM-6627: ---------------------------------------- Author: ASF GitHub Bot Created on: 20/Feb/19 23:40 Start Date: 20/Feb/19 23:40 Worklog Time Spent: 10m Work Description: udim commented on pull request #7772: [BEAM-6627] Added Metrics API processing time reporting to TextIOIT URL: https://github.com/apache/beam/pull/7772#discussion_r258724722 ########## File path: sdks/java/io/file-based-io-tests/src/test/java/org/apache/beam/sdk/io/text/TextIOIT.java ########## @@ -127,28 +140,49 @@ public void writeThenReadAll() { PipelineResult result = pipeline.run(); result.waitUntilFinish(); - publishGcsResults(result); + gatherAndPublishMetrics(result); } - private void publishGcsResults(PipelineResult result) { + private void gatherAndPublishMetrics(PipelineResult result) { + String uuid = UUID.randomUUID().toString(); + Timestamp timestamp = Timestamp.now(); + List<NamedTestResult> namedTestResults = readMetrics(result, uuid, timestamp); + publishToBigQuery(namedTestResults, bigQueryDataset, bigQueryTable); + ConsoleResultPublisher.publish(namedTestResults, uuid, timestamp.toString()); + } + + private List<NamedTestResult> readMetrics( + PipelineResult result, String uuid, Timestamp timestamp) { + List<NamedTestResult> results = new ArrayList<>(); + + MetricsReader reader = new MetricsReader(result, FILEIOIT_NAMESPACE); + long writeStartTime = reader.getStartTimeMetric("startTime"); + long writeEndTime = reader.getEndTimeMetric("middleTime"); + long readStartTime = reader.getStartTimeMetric("middleTime"); + long readEndTime = reader.getEndTimeMetric("endTime"); + double writeTime = (writeEndTime - writeStartTime) / 1000.0; + double readTime = (readEndTime - readStartTime) / 1000.0; + double copiesPerSec = calculateGcsMetric(result); + + if (copiesPerSec > 0) { + results.add( + NamedTestResult.create(uuid, timestamp.toString(), "copies_per_sec", copiesPerSec)); + } + + results.add(NamedTestResult.create(uuid, timestamp.toString(), "read_time", readTime)); + results.add(NamedTestResult.create(uuid, timestamp.toString(), "write_time", writeTime)); + + return results; + } + + private double calculateGcsMetric(PipelineResult result) { Review comment: I guess the way it was working before was by checking `if (numCopies < 0 || copyTimeMsec < 0)`, which is false if `options.getGcsPerformanceMetrics()` is false. So I think we can merge --reportGcsPerformanceMetrics and --gcsPerformanceMetrics into the latter. No need to add a separate flag, and only report copiesPerSec if --gcsPerformanceMetrics is set. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking ------------------- Worklog Id: (was: 201689) Time Spent: 4h (was: 3h 50m) > Use Metrics API in IO performance tests > --------------------------------------- > > Key: BEAM-6627 > URL: https://issues.apache.org/jira/browse/BEAM-6627 > Project: Beam > Issue Type: Improvement > Components: testing > Reporter: Michal Walenia > Assignee: Michal Walenia > Priority: Minor > Time Spent: 4h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)