[ https://issues.apache.org/jira/browse/HADOOP-13171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303230#comment-15303230 ]

Chris Nauroth commented on HADOOP-13171:
----------------------------------------

The new stats are very thorough.  The refactoring of {{ProgressListener}} and 
pulling encryption logic out of the output streams is nice too.  Thank you, 
Steve!

{{TestS3ADirectoryPerformance#testListOperations}} is timing out for me 
consistently in my runs against us-west-2.  It appears to be making progress, 
but not quickly enough.

{code}
    int scale = (int)getOperationCount();
    int width = 2 * scale;
    int depth = 2 * scale;
    int files = 2 * scale;
{code}

Are you sure this is what you wanted for these factors?  Please check my math 
here, but I believe the interaction between width and depth means the total 
number of directories grows exponentially with respect to scale, something like 
(2*scale) ^ (2*scale) + 1 (for the root).  Then, considering the file count per 
directory, we have a total of ((2*scale) ^ (2*scale) + 1) * (2 * scale) files.  
The default operation count is 2005, so these are big numbers.  Even if I 
down-tune to scale.test.operation.count=4, it still takes a long time.
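
A quick back-of-the-envelope computation in plain Java, taking the loop bounds from the snippet above and assuming directories are created at every level of the tree, shows how fast this blows up:

{code}
public class ScaleMath {
  /**
   * Total directories in a tree of the given width and depth, counting
   * the root: the sum of width^i for i = 0..depth.  The dominant term
   * is width^depth, i.e. (2*scale) ^ (2*scale).
   */
  public static long totalDirs(int scale) {
    long width = 2L * scale;
    long depth = 2L * scale;
    long dirs = 1;        // the root
    long levelDirs = 1;
    for (long i = 1; i <= depth; i++) {
      levelDirs *= width; // width^i directories at level i
      dirs += levelDirs;
    }
    return dirs;
  }

  public static void main(String[] args) {
    // Even the down-tuned scale=4 run implies ~19 million directories,
    // each holding 2*scale = 8 files.
    System.out.println(totalDirs(4));   // 19173961
  }
}
{code}

So even at scale=4 that works out to roughly 19 million directories and 150 million files, which matches the slow progress I'm seeing.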

Please add an {{@Override}} annotation to 
{{ProgressableProgressListener#progressChanged}}.

I think {{ProgressableProgressListener#getLastBytesTransferred}} is no longer 
called anywhere and can be removed.

Maybe I'm missing something, but it appears that after execution of the base 
{{FileSystem#initialize}}, it's impossible for {{FileSystem#statistics}} to be 
null.  Therefore, the null guards in the increment methods should be 
unnecessary.  If you prefer to keep them in the spirit of defensive coding, I'm 
fine with that too.

{code}
    private Iterator<Map.Entry<Statistic, AtomicLong>> iterator =
        opsCount.entrySet().iterator();
{code}

I suggest changing to the following...

{code}
    private Iterator<Map.Entry<Statistic, AtomicLong>> iterator =
        Collections.unmodifiableSet(opsCount.entrySet()).iterator();
{code}

...so that we block callers from removing entries through the iterator.  That's 
probably applicable to the equivalent code up in hadoop-common too, but it's 
best to keep the scope focused on just the hadoop-aws code here.
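
To illustrate the effect (with String keys standing in for {{Statistic}}): the wrapper's iterator still supports reads, but {{remove()}} throws {{UnsupportedOperationException}}, so the backing map stays intact:

{code}
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class UnmodifiableIteratorDemo {
  public static void main(String[] args) {
    Map<String, AtomicLong> opsCount = new HashMap<>();
    opsCount.put("op_mkdirs", new AtomicLong(1));

    Iterator<Map.Entry<String, AtomicLong>> iterator =
        Collections.unmodifiableSet(opsCount.entrySet()).iterator();
    iterator.next();
    try {
      iterator.remove();  // blocked by the unmodifiable wrapper
      System.out.println("remove succeeded");
    } catch (UnsupportedOperationException expected) {
      // The entry is still in the map.
      System.out.println("remove blocked: " + opsCount.size() + " entry left");
    }
  }
}
{code}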

The JavaDocs for {{S3ATestUtils#createSubdirs}} aren't at the right indentation 
level.

Repeating a comment from HADOOP-13131, {{getS3AFS}} might instead be an 
override of {{getFileSystem}} with a covariant return type.
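
A sketch of the idea, using stand-in classes rather than the real Hadoop types (these names are placeholders, not the actual test base class):

{code}
// Stand-ins for the real Hadoop types, just to show the pattern.
class FileSystem { }

class S3AFileSystem extends FileSystem {
  String getBucket() { return "example-bucket"; }
}

class TestBase {
  FileSystem getFileSystem() { return new S3AFileSystem(); }
}

public class CovariantReturnDemo extends TestBase {
  // Covariant return type: the override narrows FileSystem to
  // S3AFileSystem, so callers need neither a cast nor a separate
  // getS3AFS() accessor.
  @Override
  S3AFileSystem getFileSystem() {
    return (S3AFileSystem) super.getFileSystem();
  }

  public static void main(String[] args) {
    CovariantReturnDemo test = new CovariantReturnDemo();
    System.out.println(test.getFileSystem().getBucket()); // example-bucket
  }
}
{code}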

{code}
  @Test
  public void testCostOfCopyFromLocalFile() throws Throwable {
    describe("testCostOfCopyFromLocalFile");
    File tmpFile = File.createTempFile("tests3acost", ".txt");
{code}

I'd prefer to avoid accessing /tmp due to the potential for platform-specific 
oddities, e.g. HADOOP-761 and MAPREDUCE-5191.  Can we switch to something based 
on {{getContract().getTestPath()}}?


> Add StorageStatistics to S3A; instrument some more operations
> -------------------------------------------------------------
>
>                 Key: HADOOP-13171
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13171
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Minor
>         Attachments: HADOOP-13171-branch-2-001.patch, 
> HADOOP-13171-branch-2-002.patch, HADOOP-13171-branch-2-003.patch, 
> HADOOP-13171-branch-2-004.patch, HADOOP-13171-branch-2-005.patch, 
> HADOOP-13171-branch-2-006.patch, HADOOP-13171-branch-2-007.patch
>
>
> Add {{StorageStatistics}} support to S3A, collecting the same metrics as the 
> instrumentation, but sharing across all instances.


