[ https://issues.apache.org/jira/browse/HADOOP-16058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16746724#comment-16746724 ]
Steve Loughran commented on HADOOP-16058:
-----------------------------------------

Magic committer:
{code}
bin/hadoop fs -cat s3a://hwdev-steve-ireland-new/terasort-ITestMagicTerasort/results.csv
"Operation" "Duration"
"Generate" "0:22.854s"
"Terasort" "0:30.228s"
"Validate" "0:27.682s"
"Completed" "1:27.840s"
{code}

Directory staging committer:
{code}
bin/hadoop fs -cat s3a://hwdev-steve-ireland-new/terasort-ITestDirectoryTerasort/results.csv
"Operation" "Duration"
"Generate" "0:22.111s"
"Terasort" "0:24.613s"
"Validate" "0:24.504s"
"Completed" "1:19.135s"
{code}

The client is a laptop and the store is S3 Ireland, a few hundred millis away. The magic committer, which uses S3 rather than a miniHDFS cluster, suffers a lot from the latency of those S3 calls. You'd need to be creating larger files for the incremental write to become relevant.

> S3A tests to include Terasort
> -----------------------------
>
>                 Key: HADOOP-16058
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16058
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3, test
>    Affects Versions: 3.3.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>
> Add S3A tests to run terasort for the magic and directory committers.
> MAPREDUCE-7091 is a requirement for this
> Bonus feature: print the results to see which committers are faster in the
> specific test setup. As that's a function of latency to the store, bandwidth
> and size of jobs, it's not at all meaningful, just interesting.
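
For anyone wanting to compare the two results.csv listings quoted in the comment, here is a minimal Python sketch. It is not part of the S3A test suite. The helper names (`to_seconds`, `slowdown`) are hypothetical, the durations are copied by hand from the comment, and the `M:SS.mmms` reading of the duration strings (minutes, then seconds) is an assumption based on how the values look.

```python
# Durations copied by hand from the two results.csv listings in the comment.
MAGIC = {
    "Generate": "0:22.854s",
    "Terasort": "0:30.228s",
    "Validate": "0:27.682s",
    "Completed": "1:27.840s",
}
DIRECTORY = {
    "Generate": "0:22.111s",
    "Terasort": "0:24.613s",
    "Validate": "0:24.504s",
    "Completed": "1:19.135s",
}


def to_seconds(duration: str) -> float:
    """Convert a string such as '1:27.840s' to seconds.

    Assumes the format is minutes:seconds with a trailing 's'.
    """
    minutes, seconds = duration.rstrip("s").split(":")
    return int(minutes) * 60 + float(seconds)


def slowdown(magic: dict, directory: dict) -> dict:
    """Return the magic/directory duration ratio for each operation."""
    return {
        op: to_seconds(magic[op]) / to_seconds(directory[op])
        for op in magic
    }


if __name__ == "__main__":
    for op, ratio in slowdown(MAGIC, DIRECTORY).items():
        print(f"{op:<10} magic/directory = {ratio:.2f}")
```

On these numbers the Terasort stage takes roughly 1.2 times as long with the magic committer as with the directory committer. As the issue description says, that depends on latency, bandwidth and job size in this one setup, so treat it as interesting rather than meaningful.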