Yiqun Lin created HDFS-13678:
--------------------------------

             Summary: StorageType is incompatible when rolling upgrade to 
2.6/2.6+ versions
                 Key: HDFS-13678
                 URL: https://issues.apache.org/jira/browse/HDFS-13678
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: rolling upgrades
    Affects Versions: 2.5.0
            Reporter: Yiqun Lin


Since version 2.6.0, HDFS supports more storage types, implemented in 
HDFS-6584. But this seems to be an incompatible change: when we rolling-upgrade 
our cluster from 2.5.0 to 2.6.0, the following error is thrown.
{noformat}
2018-06-14 11:43:39,246 ERROR [DataNode: 
[[[DISK]file:/home/vipshop/hard_disk/dfs/, [DISK]file:/data1/dfs/, 
[DISK]file:/data2/dfs/]] heartbeating to xx.xx.xx.xx:8022] 
org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in BPOfferService 
for Block pool BP-670256553-xx.xx.xx.xx-1528795419404 (Datanode Uuid 
ab150e05-fcb7-49ed-b8ba-f05c27593fee) service to xx.xx.xx.xx:8022
java.lang.ArrayStoreException
 at java.util.ArrayList.toArray(ArrayList.java:412)
 at java.util.Collections$UnmodifiableCollection.toArray(Collections.java:1034)
 at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1030)
 at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:836)
 at 
org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:146)
 at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:566)
 at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:664)
 at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:835)
 at java.lang.Thread.run(Thread.java:748)
{noformat}
The scenario is that the new-version NN mis-parses the StorageType sent from the 
old-version DN. This is triggered by {{DNA_TRANSFER}} commands, that is to say, 
the error appears as soon as there are under-replicated blocks.

The convert logic is here:
{code:java}
  public static BlockCommand convert(BlockCommandProto blkCmd) {
    List<BlockProto> blockProtoList = blkCmd.getBlocksList();
    Block[] blocks = new Block[blockProtoList.size()];
    ...

    StorageType[][] targetStorageTypes = new StorageType[targetList.size()][];
    List<StorageTypesProto> targetStorageTypesList =
        blkCmd.getTargetStorageTypesList();
    if (targetStorageTypesList.isEmpty()) { // missing storage types
      for (int i = 0; i < targetStorageTypes.length; i++) {
        targetStorageTypes[i] = new StorageType[targets[i].length];
        Arrays.fill(targetStorageTypes[i], StorageType.DEFAULT);
      }
    } else {
      for (int i = 0; i < targetStorageTypes.length; i++) {
        List<StorageTypeProto> p =
            targetStorageTypesList.get(i).getStorageTypesList();
        targetStorageTypes[i] = p.toArray(new StorageType[p.size()]); // <=== should do the try-catch here
      }
    }
{code}
An easy fix is to wrap that statement in a try-catch and fall back to the 
default storage type when parsing fails.
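
For illustration, a minimal sketch of that idea against the snippet quoted above 
(variable names are reused from the quoted code; the caught exception type and 
the fallback are assumptions about the fix, not a final patch):
{code:java}
    } else {
      for (int i = 0; i < targetStorageTypes.length; i++) {
        List<StorageTypeProto> p =
            targetStorageTypesList.get(i).getStorageTypesList();
        try {
          targetStorageTypes[i] = p.toArray(new StorageType[p.size()]);
        } catch (ArrayStoreException e) {
          // The peer sent storage types this version cannot represent:
          // fall back to DEFAULT instead of failing the whole command.
          targetStorageTypes[i] = new StorageType[targets[i].length];
          Arrays.fill(targetStorageTypes[i], StorageType.DEFAULT);
        }
      }
    }
{code}
The fallback mirrors the existing empty-list branch, so the DNA_TRANSFER command 
still carries usable (default) storage types instead of aborting heartbeat 
processing.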


