[ 
https://issues.apache.org/jira/browse/HDFS-15811?focusedWorklogId=548878&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-548878
 ]

ASF GitHub Bot logged work on HDFS-15811:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 05/Feb/21 22:18
            Start Date: 05/Feb/21 22:18
    Worklog Time Spent: 10m 
      Work Description: zehaoc2 commented on a change in pull request #2670:
URL: https://github.com/apache/hadoop/pull/2670#discussion_r571280969



##########
File path: hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
##########
@@ -3146,23 +3148,30 @@ INodeFile checkLease(INodesInPath iip, String holder, long fileId)
   boolean completeFile(final String src, String holder,
                        ExtendedBlock last, long fileId)
     throws IOException {
+    final String operationName = CMD_COMPLETE_FILE;
     boolean success = false;
+    FileStatus stat = null;
     checkOperation(OperationCategory.WRITE);
     final FSPermissionChecker pc = getPermissionChecker();
     FSPermissionChecker.setOperationType(null);
     writeLock();
     try {
       checkOperation(OperationCategory.WRITE);
       checkNameNodeSafeMode("Cannot complete file " + src);
-      success = FSDirWriteFileOp.completeFile(this, pc, src, holder, last,
+      INodesInPath iip = dir.resolvePath(pc, src, fileId);
+      success = FSDirWriteFileOp.completeFile(this, iip, src, holder, last,
                                               fileId);
+      if (success) {
+        stat = dir.getAuditFileInfo(iip);
+      }
     } finally {
-      writeUnlock("completeFile");
+      writeUnlock(operationName);

Review comment:
       Sorry, I should change this to "complete" instead of "close". I made 
this change because audit log cmd names usually mimic the client API names 
rather than the RPC method names. For instance, the RPC method "startFile" is 
audit logged as cmd "create".
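
As a rough illustration of the naming convention described in this comment, the sketch below maps RPC method names to audit log cmd names. The class `AuditCmdNames` and the method `auditCmdFor` are hypothetical and do not exist in Hadoop. The "startFile" entry comes from the comment itself, and the "completeFile" entry reflects the rename proposed here, not necessarily what was merged.

```java
import java.util.Map;

/**
 * Hypothetical illustration only; this class is not part of Hadoop.
 * It shows audit cmd names following client API names rather than
 * RPC method names.
 */
public final class AuditCmdNames {
    // RPC method name -> audit log cmd name.
    // "startFile" -> "create" is the example given in the review comment;
    // "completeFile" -> "complete" is the value proposed in that comment.
    private static final Map<String, String> RPC_TO_AUDIT_CMD = Map.of(
        "startFile", "create",
        "completeFile", "complete");

    private AuditCmdNames() {
    }

    /**
     * Returns the audit cmd name for an RPC method. Falling back to the
     * RPC method name for unmapped methods is an assumption of this sketch.
     */
    public static String auditCmdFor(String rpcMethod) {
        return RPC_TO_AUDIT_CMD.getOrDefault(rpcMethod, rpcMethod);
    }

    public static void main(String[] args) {
        System.out.println(auditCmdFor("startFile"));
        System.out.println(auditCmdFor("completeFile"));
    }
}
```

With a mapping like this, a single constant (such as `CMD_COMPLETE_FILE` in the diff above) can serve both as the lock-hold label passed to `writeUnlock` and as the audit cmd, so the two names cannot drift apart.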




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 548878)
    Time Spent: 40m  (was: 0.5h)

> completeFile should log final file size
> ---------------------------------------
>
>                 Key: HDFS-15811
>                 URL: https://issues.apache.org/jira/browse/HDFS-15811
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Zehao Chen
>            Assignee: Zehao Chen
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Jobs, particularly Hive queries by non-headless users, can create an 
> excessive number of files (many hundreds of thousands). A single user's query 
> can generate a sustained burst of 60-80% of all creates for tens of minutes 
> or more and impact overall cluster performance. Adding the file size to the 
> log line allows us to identify excessive tiny or large files.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
