[jira] [Updated] (HDFS-5667) StorageType and State in DatanodeStorageInfo in NameNode is not accurate

2013-12-13 Thread Eric Sirianni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Sirianni updated HDFS-5667:


Description: 
The fix for HDFS-5484 was accidentally regressed by the following change made 
via HDFS-5542
{code}
+  DatanodeStorageInfo updateStorage(DatanodeStorage s) {
 synchronized (storageMap) {
   DatanodeStorageInfo storage = storageMap.get(s.getStorageID());
   if (storage == null) {
@@ -670,8 +658,6 @@
   for DN  + getXferAddr());
 storage = new DatanodeStorageInfo(this, s);
 storageMap.put(s.getStorageID(), storage);
-  } else {
-storage.setState(s.getState());
   }
   return storage;
 }
{code}

By removing the 'else' and no longer updating the state in the BlockReport 
processing path, we effectively get the bogus state  type that is set via the 
first heartbeat (see the fix for HDFS-5455):

{code}
+  if (storage == null) {
+// This is seen during cluster initialization when the heartbeat
+// is received before the initial block reports from each storage.
+storage = updateStorage(new DatanodeStorage(report.getStorageID()));
{code}
Even reverting the change and reintroducing the 'else' leaves the state  type 
temporarily inaccurate until the first block report. 

As discussed with [~arpitagarwal], a better fix would be to simply include the 
full DatanodeStorage object in the StorageReport (as opposed to only the 
Storage ID).  This requires adding the {{DatanodeStorage}} object to 
{{StorageReportProto}}. It needs to be a new optional field and we cannot 
remove the existing {{StorageUuid}} for protocol compatibility.

  was:
The fields in DatanodeStorageInfo are updated from two distinct paths:
# block reports
# storage reports (via heartbeats)

The {{state}} and {{storageType}} fields are updated via the Block Report.  
However, as seen in the code blow, these fields are populated from a dummy 
{{DatanodeStorage}} object constructed in the DataNode:
{code}
BPServiceActor.blockReport() {
//...
// Dummy DatanodeStorage object just for sending the block report.
DatanodeStorage dnStorage = new DatanodeStorage(storageID);
//...
}
{code}

The net effect is that the {{state}} and {{storageType}} fields are always the 
default of {{NORMAL}} and {{DISK}} in the NameNode.

The recommended fix is to change {{FsDatasetSpi.getBlockReports()}} from:
{code}
public MapString, BlockListAsLongs getBlockReports(String bpid);
{code}
to:
{code}
public MapDatanodeStorage, BlockListAsLongs getBlockReports(String bpid);
{code}
thereby allowing {{BPServiceActor}} to send the real {{DatanodeStorage}} 
object with the block report.


 StorageType and State in DatanodeStorageInfo in NameNode is not accurate
 

 Key: HDFS-5667
 URL: https://issues.apache.org/jira/browse/HDFS-5667
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: Heterogeneous Storage (HDFS-2832)
Reporter: Eric Sirianni
 Fix For: Heterogeneous Storage (HDFS-2832)


 The fix for HDFS-5484 was accidentally regressed by the following change made 
 via HDFS-5542
 {code}
 +  DatanodeStorageInfo updateStorage(DatanodeStorage s) {
  synchronized (storageMap) {
DatanodeStorageInfo storage = storageMap.get(s.getStorageID());
if (storage == null) {
 @@ -670,8 +658,6 @@
for DN  + getXferAddr());
  storage = new DatanodeStorageInfo(this, s);
  storageMap.put(s.getStorageID(), storage);
 -  } else {
 -storage.setState(s.getState());
}
return storage;
  }
 {code}
 By removing the 'else' and no longer updating the state in the BlockReport 
 processing path, we effectively get the bogus state  type that is set via 
 the first heartbeat (see the fix for HDFS-5455):
 {code}
 +  if (storage == null) {
 +// This is seen during cluster initialization when the heartbeat
 +// is received before the initial block reports from each storage.
 +storage = updateStorage(new DatanodeStorage(report.getStorageID()));
 {code}
 Even reverting the change and reintroducing the 'else' leaves the state  
 type temporarily inaccurate until the first block report. 
 As discussed with [~arpitagarwal], a better fix would be to simply include 
 the full DatanodeStorage object in the StorageReport (as opposed to only the 
 Storage ID).  This requires adding the {{DatanodeStorage}} object to 
 {{StorageReportProto}}. It needs to be a new optional field and we cannot 
 remove the existing {{StorageUuid}} for protocol compatibility.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Updated] (HDFS-5667) StorageType and State in DatanodeStorageInfo in NameNode is not accurate

2013-12-13 Thread Eric Sirianni (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eric Sirianni updated HDFS-5667:


Description: 
The fix for HDFS-5484 was accidentally regressed by the following change made 
via HDFS-5542
{code}
+  DatanodeStorageInfo updateStorage(DatanodeStorage s) {
 synchronized (storageMap) {
   DatanodeStorageInfo storage = storageMap.get(s.getStorageID());
   if (storage == null) {
@@ -670,8 +658,6 @@
   for DN  + getXferAddr());
 storage = new DatanodeStorageInfo(this, s);
 storageMap.put(s.getStorageID(), storage);
-  } else {
-storage.setState(s.getState());
   }
   return storage;
 }
{code}

By removing the 'else' and no longer updating the state in the BlockReport 
processing path, we effectively get the bogus state  type that is set via the 
first heartbeat (see the fix for HDFS-5455):

{code}
+  if (storage == null) {
+// This is seen during cluster initialization when the heartbeat
+// is received before the initial block reports from each storage.
+storage = updateStorage(new DatanodeStorage(report.getStorageID()));
{code}
Even reverting the change and reintroducing the 'else' leaves the state  type 
temporarily inaccurate until the first block report. 

As discussed with [~arpitagarwal], a better fix would be to simply include the 
full {{DatanodeStorage}} object in the {{StorageReport}} (as opposed to only 
the Storage ID).  This requires adding the {{DatanodeStorage}} object to 
{{StorageReportProto}}. It needs to be a new optional field and we cannot 
remove the existing {{StorageUuid}} for protocol compatibility.

  was:
The fix for HDFS-5484 was accidentally regressed by the following change made 
via HDFS-5542
{code}
+  DatanodeStorageInfo updateStorage(DatanodeStorage s) {
 synchronized (storageMap) {
   DatanodeStorageInfo storage = storageMap.get(s.getStorageID());
   if (storage == null) {
@@ -670,8 +658,6 @@
   for DN  + getXferAddr());
 storage = new DatanodeStorageInfo(this, s);
 storageMap.put(s.getStorageID(), storage);
-  } else {
-storage.setState(s.getState());
   }
   return storage;
 }
{code}

By removing the 'else' and no longer updating the state in the BlockReport 
processing path, we effectively get the bogus state  type that is set via the 
first heartbeat (see the fix for HDFS-5455):

{code}
+  if (storage == null) {
+// This is seen during cluster initialization when the heartbeat
+// is received before the initial block reports from each storage.
+storage = updateStorage(new DatanodeStorage(report.getStorageID()));
{code}
Even reverting the change and reintroducing the 'else' leaves the state  type 
temporarily inaccurate until the first block report. 

As discussed with [~arpitagarwal], a better fix would be to simply include the 
full DatanodeStorage object in the StorageReport (as opposed to only the 
Storage ID).  This requires adding the {{DatanodeStorage}} object to 
{{StorageReportProto}}. It needs to be a new optional field and we cannot 
remove the existing {{StorageUuid}} for protocol compatibility.


 StorageType and State in DatanodeStorageInfo in NameNode is not accurate
 

 Key: HDFS-5667
 URL: https://issues.apache.org/jira/browse/HDFS-5667
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Affects Versions: Heterogeneous Storage (HDFS-2832)
Reporter: Eric Sirianni
 Fix For: Heterogeneous Storage (HDFS-2832)


 The fix for HDFS-5484 was accidentally regressed by the following change made 
 via HDFS-5542
 {code}
 +  DatanodeStorageInfo updateStorage(DatanodeStorage s) {
  synchronized (storageMap) {
DatanodeStorageInfo storage = storageMap.get(s.getStorageID());
if (storage == null) {
 @@ -670,8 +658,6 @@
for DN  + getXferAddr());
  storage = new DatanodeStorageInfo(this, s);
  storageMap.put(s.getStorageID(), storage);
 -  } else {
 -storage.setState(s.getState());
}
return storage;
  }
 {code}
 By removing the 'else' and no longer updating the state in the BlockReport 
 processing path, we effectively get the bogus state  type that is set via 
 the first heartbeat (see the fix for HDFS-5455):
 {code}
 +  if (storage == null) {
 +// This is seen during cluster initialization when the heartbeat
 +// is received before the initial block reports from each storage.
 +storage = updateStorage(new DatanodeStorage(report.getStorageID()));
 {code}
 Even reverting the change and reintroducing the 'else' leaves the state  
 type temporarily inaccurate until the first block report.