[ https://issues.apache.org/jira/browse/HADOOP-13551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16923358#comment-16923358 ]
Steve Loughran commented on HADOOP-13551:
-----------------------------------------

# The way to do this is to pass a RequestMetricCollector to the S3 client constructor; it will be invoked before/after each call and can collect and publish metrics.
# Collected metrics are in com.amazonaws.util.AWSRequestMetrics.
# This includes throttling retries as well as the performance of operations (including the time to request a signature and sign requests),
# and SDK-wide state such as HTTPS pool capacity.

Presumably the way to deal with pool capacity is to have a value which is updated on every response; it will jitter a lot when many requests are being made.

* Timing values for common setup/sign operations, which are independent of data size, could go to a quantile.
* We currently only count bytes PUT and some of the input stream values,
* NOT bytes copied,
* and input stream bytes read are mapped to the wrong statistic, STREAM_SEEK_BYTES_READ("stream_bytes_read", "Count of bytes read during seek() in stream operations").

Proposed: we collect this stuff and serve it up for apps like Impala to collect.

> hook up AwsSdkMetrics to hadoop metrics
> ---------------------------------------
>
>                 Key: HADOOP-13551
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13551
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.0.0-beta1
>            Reporter: Steve Loughran
>            Assignee: Sean Mackrory
>            Priority: Major
>
> There's an API in {{com.amazonaws.metrics.AwsSdkMetrics}} to give access to
> the internal metrics of the AWS libraries. We might want to get at those
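To make the comment's three kinds of values concrete (a counter for throttling retries, a last-value gauge for pool capacity that is updated on every response, and a quantile for size-independent timings such as signing), here is a minimal sketch. It uses only the Java standard library: the class SdkMetricsSketch, its methods, and the sliding-window nearest-rank quantile are all hypothetical choices for illustration, not the S3A implementation and not AWS SDK API. The wiring that would extract these values from AWSRequestMetrics inside a RequestMetricCollector is deliberately left out.

```java
import java.util.Arrays;

/**
 * Hypothetical aggregator for per-request SDK metrics.
 * Assumption: some collector callback calls onResponse() once per completed
 * request with values it has already pulled out of the SDK's metrics object.
 */
final class SdkMetricsSketch {
    private long throttleRetries;      // counter: only ever grows
    private long poolAvailable = -1;   // gauge: last value seen, -1 until first response
    private final long[] signingNanos; // ring buffer of recent signing times
    private int next;                  // next slot to overwrite
    private int size;                  // number of valid samples in the buffer

    SdkMetricsSketch(int window) {
        if (window <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        this.signingNanos = new long[window];
    }

    /** Record one completed request. */
    synchronized void onResponse(long signingTimeNanos, int retriesDueToThrottling,
                                 int poolAvailableNow) {
        throttleRetries += retriesDueToThrottling;
        poolAvailable = poolAvailableNow; // overwritten each time, so it will jitter under load
        signingNanos[next] = signingTimeNanos;
        next = (next + 1) % signingNanos.length;
        if (size < signingNanos.length) {
            size++;
        }
    }

    synchronized long throttleRetries() {
        return throttleRetries;
    }

    synchronized long poolAvailable() {
        return poolAvailable;
    }

    /** Nearest-rank quantile of the signing times in the window; q must be in (0, 1]. */
    synchronized long signingQuantile(double q) {
        if (size == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        if (!(q > 0.0 && q <= 1.0)) {
            throw new IllegalArgumentException("q must be in (0, 1]");
        }
        long[] sorted = Arrays.copyOf(signingNanos, size);
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(q * size); // 1-based rank
        return sorted[rank - 1];
    }

    public static void main(String[] args) {
        SdkMetricsSketch metrics = new SdkMetricsSketch(100);
        metrics.onResponse(1_200_000L, 0, 48);
        metrics.onResponse(900_000L, 2, 45);
        System.out.println("throttle retries = " + metrics.throttleRetries());
        System.out.println("pool available   = " + metrics.poolAvailable());
        System.out.println("median sign time = " + metrics.signingQuantile(0.5) + " ns");
    }
}
```

A fixed-size window keeps memory bounded at the cost of forgetting older samples; a real integration would more likely publish through Hadoop's existing metrics classes than reimplement quantiles this way.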