[ 
https://issues.apache.org/jira/browse/HDFS-4988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-4988:
--------------------------------

    Attachment: HDFS-4988.08.patch

{quote}
It seems that StorageDirectory.storageID should be moved to DataStorage.
{quote}
The DataStorage and StorageDirectory class organization is not easy to work 
with. There is a single DataStorage object that manages all of the 
StorageDirectories, and the storage ID is different for each StorageDirectory, 
so I don't see how to make it a member of DataStorage. Do you mind if we leave 
it this way for now? Fixing it would require refactoring unrelated code; maybe 
we can do it in trunk later.
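To illustrate what I mean (this is a simplified sketch, not the actual HDFS 
classes), the one-to-many relationship looks roughly like this, which is why a 
single storageID field on DataStorage would not work:

{code:java}
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of the relationship: one DataStorage manages many
// StorageDirectory instances, and each directory carries its own storage ID.
class StorageDirectorySketch {
  private final File root;
  private final String storageID;   // unique per directory/volume

  StorageDirectorySketch(File root, String storageID) {
    this.root = root;
    this.storageID = storageID;
  }

  String getStorageID() {
    return storageID;
  }
}

class DataStorageSketch {
  // A single DataStorage instance manages every storage directory...
  private final List<StorageDirectorySketch> storageDirs = new ArrayList<>();

  // ...so one storageID field here could not represent all of them.
  void addStorageDir(StorageDirectorySketch sd) {
    storageDirs.add(sd);
  }
}
{code}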


{quote}
In DataStorage, should cachedDatanodeUuid be moved to DataNode? Also, would 
calling it as "datanodeUuid" good enough? For example, we use storageID (or 
storageUuid) but not cachedStorageID.
{quote}
I think it needs to be a part of DataStorage. It is read from the VERSION file 
and needs a field like the remaining entries (I renamed it to DatanodeUuid).
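As a rough sketch of what I have in mind (field and key names here are 
illustrative, not the exact patch code), the UUID is loaded from the VERSION 
file along with the other storage properties:

{code:java}
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

// Illustrative only: the datanode UUID is persisted in the VERSION file next
// to the other storage properties, so DataStorage keeps it as a field just
// like the existing entries.
class DataStorageVersionSketch {
  private String datanodeUuid;

  void readProperties(File versionFile) throws IOException {
    Properties props = new Properties();
    try (FileInputStream in = new FileInputStream(versionFile)) {
      props.load(in);
    }
    // Read alongside the other VERSION entries (key name is hypothetical).
    datanodeUuid = props.getProperty("datanodeUuid");
  }

  String getDatanodeUuid() {
    return datanodeUuid;
  }
}
{code}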


{quote}
In new FsVolumeSpi.getStorageID() method, do you want to call it 
getStorageUuid()?
{quote}
Yes, I would like to rename it when we rename it everywhere else; for now I 
wanted to keep it as consistent as possible. I filed HDFS-5264 for this.


{quote}
For the new perVolumeReplicaMap, it may be better to change ReplicaMap or add a 
new class (say VolumeReplicaMap, which has a volume-to-ReplicaMap map) to 
support the per-volume feature.
{quote}
Can I do this in a separate Jira? I agree with you that it needs cleanup. I'll 
fix it in HDFS-2832.
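For the record, something like the sketch below is what I understand the 
suggestion to be (class and method names are hypothetical; the real cleanup 
will happen in HDFS-2832):

{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the suggested VolumeReplicaMap: a single class that
// owns the volume-to-ReplicaMap mapping instead of a loose perVolumeReplicaMap
// maintained alongside the existing ReplicaMap.
class VolumeReplicaMapSketch<V, R> {
  private final Map<V, R> replicasByVolume = new ConcurrentHashMap<>();

  R getReplicaMap(V volume) {
    return replicasByVolume.get(volume);
  }

  void addVolume(V volume, R replicaMap) {
    replicasByVolume.put(volume, replicaMap);
  }

  void removeVolume(V volume) {
    replicasByVolume.remove(volume);
  }
}
{code}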


{quote}
The old FsDatasetSpi.getBlockReport(String bpid) method should be removed. It 
is only used in tests.
{quote}
Do you mind if I remove it in a separate checkin when I fix the tests?

{quote}
I suggest not to manually fix the imports. We may use some IDE tools to fix 
them (e.g. cmd-shift-o in Eclipse).
{quote}
This was done automatically by IntelliJ, probably because the new imports made 
the list too long. I think it is reasonable here since the list of imports 
would be edited anyway.

I have addressed all the other remaining comments from both of you. Thanks for 
the reviews, Nicholas and Junping.
                
> Datanode must support all the volumes as individual storages
> ------------------------------------------------------------
>
>                 Key: HDFS-4988
>                 URL: https://issues.apache.org/jira/browse/HDFS-4988
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>            Reporter: Suresh Srinivas
>            Assignee: Arpit Agarwal
>         Attachments: HDFS-4988.01.patch, HDFS-4988.02.patch, 
> HDFS-4988.05.patch, HDFS-4988.06.patch, HDFS-4988.07.patch, HDFS-4988.08.patch
>
>
> Currently all the volumes on a datanode are reported as a single storage. 
> This change proposes reporting them as individual storages. This requires:
> # A unique storage ID for each storage
> #* This needs to be generated during formatting
> # There should be an option to allow existing disks to be reported as a 
> single storage unit for backward compatibility.
> # Functionality is also needed to split existing volumes reported as a 
> single storage unit into individual storage units.
> # -Configuration must allow a storage type attribute for each storage unit. 
> (Now HDFS-5000)-
> # Block reports must be sent on a per-storage basis. In some cases (such as 
> a memory tier) block reports may need to be sent more frequently, which 
> means the block reporting period must be on a per-storage-type basis.
> My proposal is for new clusters to configure volumes as separate storage 
> units by default. Let's discuss.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
