[ 
https://issues.apache.org/jira/browse/HADOOP-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HADOOP-3364:
----------------------------------------

    Attachment: fastLoadImage.patch

- Fixed one new FindBugs warning and two old ones.
- Incorporated most of Nicholas's comments.
   Did not make FSImage.uStr a large array because UTF8 does not have a
capacity, and because it does not affect performance.
   Did not remove isRoot. This is the definition of the root.
   Removed getParentINode, although it is the method that will replace the
current getParent().
   Refactored FSDirectory/INodeDirectory.addToParent(...) to reuse the code it
duplicated from addNode().
   LeaseManager was printing logs at info level, which polluted the log
during startup and affected performance. Had to downgrade them to debug level.


> Faster image and log edits loading.
> -----------------------------------
>
>                 Key: HADOOP-3364
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3364
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: dfs
>    Affects Versions: 0.18.0
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>             Fix For: 0.18.0
>
>         Attachments: fastLoadImage.patch, fastLoadImage.patch
>
>
> This patch optimizes the code to load the fsimage and the edits log faster.
> I implemented the ideas mentioned in HADOOP-3022. Namely, I removed
> unnecessary object allocations and implemented optimized loading, which
> avoids unnecessary name-space tree lookups if consecutive files belong to
> the same parent.
> I changed the saveImage algorithm so that it first writes all children of a
> directory and then descends into its sub-directories. This does not change
> the format of the image, only the order of the stored objects.
> This should make loading faster once the image is saved with the new version.
> The performance gains are:
> load/save fsimage: 15-20%
> load edits: 5-10%
> I expected somewhat more from these changes in terms of performance,
> especially for edits, but it turned out that recent changes substantially
> slowed down edits loading. ADD and CLOSE operations first remove the existing
> file with all its blocks, then add it back with potentially new blocks, and
> then ADD additionally replaces the just-inserted inode with an
> inode-under-construction for the same file.
> This is very inefficient, but hard to fix.
> I'll do it in a separate jira.
> Other changes:
> - I combined most of the UTF8 references in one place at least for FSImage.
> - Included log messages about the startup progress with load/save times and 
> file sizes.
> - Removed pre-crc-upgrade code from FSEdits, which was missed by the 
> crc-remove patch.
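
The save order and the parent-caching load described in the issue can be
sketched roughly as follows. This is only an illustration, not the code in
fastLoadImage.patch: the class ImageOrderSketch, its Node type, the
treeLookups counter, and the record format (full path strings, with a
trailing "/" marking a directory) are all assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ImageOrderSketch {
    /** Hypothetical stand-in for an inode; children == null means a file. */
    static final class Node {
        final String name;
        final Map<String, Node> children;

        Node(String name, boolean isDir) {
            this.name = name;
            this.children = isDir ? new LinkedHashMap<>() : null;
        }

        boolean isDir() { return children != null; }
    }

    /** Counts how often a parent had to be resolved from the root. */
    static int treeLookups = 0;

    /** Writes all children of dir first, then descends into its sub-directories. */
    static void save(Node dir, String dirPath, List<String> out) {
        for (Node child : dir.children.values()) {
            out.add(dirPath + "/" + child.name + (child.isDir() ? "/" : ""));
        }
        for (Node child : dir.children.values()) {
            if (child.isDir()) {
                save(child, dirPath + "/" + child.name, out);
            }
        }
    }

    /** Full name-space lookup: walks from the root down to the given path. */
    static Node resolve(Node root, String path) {
        treeLookups++;
        Node cur = root;
        for (String part : path.split("/")) {
            if (!part.isEmpty()) {
                cur = cur.children.get(part);
            }
        }
        return cur;
    }

    /** Rebuilds the tree, reusing the previous parent while it stays the same. */
    static Node load(List<String> records) {
        Node root = new Node("", true);
        String lastParentPath = null;
        Node lastParent = null;
        for (String rec : records) {
            boolean isDir = rec.endsWith("/");
            String path = isDir ? rec.substring(0, rec.length() - 1) : rec;
            int slash = path.lastIndexOf('/');
            String parentPath = path.substring(0, slash);
            String name = path.substring(slash + 1);
            if (!parentPath.equals(lastParentPath)) {
                // Parent changed: do one lookup from the root and cache it.
                lastParent = resolve(root, parentPath);
                lastParentPath = parentPath;
            }
            lastParent.children.put(name, new Node(name, isDir));
        }
        return root;
    }

    /** Small made-up tree: /a/{x,y,b/{z}} and /f1. */
    static Node sampleTree() {
        Node root = new Node("", true);
        Node a = new Node("a", true);
        Node b = new Node("b", true);
        root.children.put("a", a);
        root.children.put("f1", new Node("f1", false));
        a.children.put("x", new Node("x", false));
        a.children.put("y", new Node("y", false));
        a.children.put("b", b);
        b.children.put("z", new Node("z", false));
        return root;
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        save(sampleTree(), "", records);
        load(records);
        System.out.println(records + " loaded with " + treeLookups + " lookups");
    }
}
```

Because each directory's children are written contiguously, the loader in
this sketch resolves a parent once per directory rather than once per record:
the six records of the sample tree need three lookups.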

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
