[jira] [Resolved] (HDFS-1617) CLONE to COMMON - Batch the calls in DataStorage to FileUtil.createHardLink(), so we call it once per directory instead of once per file

2011-11-23 Thread Konstantin Shvachko (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko resolved HDFS-1617.
---

Resolution: Duplicate

> CLONE to COMMON - Batch the calls in DataStorage to 
> FileUtil.createHardLink(), so we call it once per directory instead of once 
> per file
> 
>
> Key: HDFS-1617
> URL: https://issues.apache.org/jira/browse/HDFS-1617
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: data-node
>Affects Versions: 0.20.2
>Reporter: Matt Foley
>Assignee: Matt Foley
> Fix For: 0.22.0
>
>
> It was a bit of a puzzle why we can do a full scan of a disk in about 30 
> seconds during FSDir() or getVolumeMap(), but the same disk took 11 minutes 
> to do upgrade replication via hardlinks.  It turns out that the 
> org.apache.hadoop.fs.FileUtil.createHardLink() method makes an outcall to 
> Runtime.getRuntime().exec() to use the native filesystem's hardlink 
> capability.  So it forks a full-weight external process, and we call it 
> once for each individual file to be replicated.
> As a simple check on the possible cost of this approach, I built a Perl test 
> script (under Linux on a production-class datanode).  Perl also uses a 
> compiled and optimized p-code engine, and it has both native support for 
> hardlinks and the ability to do "exec".
> -  A simple script that created 256,000 files in a directory tree organized 
> like the Datanode took 10 seconds to run.
> -  Replicating that directory tree using hardlinks, the same way the 
> Datanode does, took 12 seconds using native hardlink support.
> -  The same replication using outcalls to exec, one per file, took 256 
> seconds!
> -  Batching the calls and doing 'exec' once per directory instead of once 
> per file took 16 seconds (see the sketch after this description).
> Obviously, your mileage will vary based on the number of blocks per volume.  
> A volume with fewer than about 4,000 blocks will have only 65 directories.  A 
> volume with more than 4K and fewer than about 250K blocks will have roughly 
> 4,200 directories.  And there are two files per block (the data file and the 
> .meta file), so the average number of files per directory may vary from 2:1 
> to 500:1.  A node with 50K blocks and four volumes will have 25K files per 
> volume, or an average of about 6:1.  So this change may be expected to take 
> upgrade time down from, say, 12 minutes per volume to about 2.
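
The batching described in the report above (one "ln" invocation per directory
rather than one per file) can be sketched roughly as follows.  This is a
minimal, hypothetical illustration, not the actual DataStorage/FileUtil patch:
the class and method names are made up, and it assumes a POSIX "ln" command is
available on the PATH.

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not the actual Hadoop code): hard-link every plain
// file in srcDir into dstDir.
public class BatchedHardLink {

  // Naive approach: fork one "ln" process per file, which is the behavior
  // that made upgrade replication take minutes per volume.
  public static void linkOnePerFile(File srcDir, File dstDir)
      throws IOException, InterruptedException {
    File[] files = srcDir.listFiles();
    if (files == null) return;                 // srcDir is not a directory
    for (File f : files) {
      if (!f.isFile()) continue;
      Process p = new ProcessBuilder("ln", f.getAbsolutePath(),
          new File(dstDir, f.getName()).getAbsolutePath()).start();
      if (p.waitFor() != 0) {
        throw new IOException("ln failed for " + f);
      }
    }
  }

  // Batched approach: fork one "ln" process per directory by passing all
  // source files on a single command line: "ln src1 src2 ... dstDir".
  public static void linkOnePerDirectory(File srcDir, File dstDir)
      throws IOException, InterruptedException {
    File[] files = srcDir.listFiles();
    if (files == null) return;                 // srcDir is not a directory
    List<String> cmd = new ArrayList<String>();
    cmd.add("ln");
    for (File f : files) {
      if (f.isFile()) {
        cmd.add(f.getAbsolutePath());          // all sources in one command
      }
    }
    if (cmd.size() == 1) return;               // directory had no plain files
    cmd.add(dstDir.getAbsolutePath());         // target directory goes last
    Process p = new ProcessBuilder(cmd).start();
    if (p.waitFor() != 0) {
      throw new IOException("ln failed for directory " + srcDir);
    }
  }
}

With a few hundred files per directory, as estimated above, the batched
command line stays well within typical OS argument-length limits; a directory
with many thousands of files would need to be split into chunks.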

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] Resolved: (HDFS-1617) CLONE to COMMON - Batch the calls in DataStorage to FileUtil.createHardLink(), so we call it once per directory instead of once per file

2011-02-09 Thread Matt Foley (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Foley resolved HDFS-1617.
--

  Resolution: Fixed
Release Note:   (was: Batch hardlinking during "upgrade" snapshots, cutting 
time from approx. 8 minutes per volume to approx. 8 seconds.  Validated on both 
Linux and Windows.  Requires a coordinated change in both COMMON and HDFS.)

No change here; this needs to be opened under COMMON.

