Yiqun Lin created HDFS-13678:
--------------------------------

             Summary: StorageType is incompatible when rolling upgrade to 2.6/2.6+ versions
                 Key: HDFS-13678
                 URL: https://issues.apache.org/jira/browse/HDFS-13678
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: rolling upgrades
    Affects Versions: 2.5.0
            Reporter: Yiqun Lin
In version 2.6.0, we supported more storage types in HDFS, implemented in HDFS-6584. But this seems to be an incompatible change: when we do a rolling upgrade of our cluster from 2.5.0 to 2.6.0, the following error is thrown.

{noformat}
2018-06-14 11:43:39,246 ERROR [DataNode: [[[DISK]file:/home/vipshop/hard_disk/dfs/, [DISK]file:/data1/dfs/, [DISK]file:/data2/dfs/]] heartbeating to xx.xx.xx.xx:8022] org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in BPOfferService for Block pool BP-670256553-xx.xx.xx.xx-1528795419404 (Datanode Uuid ab150e05-fcb7-49ed-b8ba-f05c27593fee) service to xx.xx.xx.xx:8022
java.lang.ArrayStoreException
        at java.util.ArrayList.toArray(ArrayList.java:412)
        at java.util.Collections$UnmodifiableCollection.toArray(Collections.java:1034)
        at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:1030)
        at org.apache.hadoop.hdfs.protocolPB.PBHelper.convert(PBHelper.java:836)
        at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolClientSideTranslatorPB.sendHeartbeat(DatanodeProtocolClientSideTranslatorPB.java:146)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.sendHeartBeat(BPServiceActor.java:566)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.offerService(BPServiceActor.java:664)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:835)
        at java.lang.Thread.run(Thread.java:748)
{noformat}

The scenario is that the old-version DN fails to parse the StorageType values sent from the new-version NN (the exception is thrown on the DN side while handling the heartbeat response). This is triggered by {{DNA_TRANSFER}} commands; that is to say, if there are under-replicated blocks, the error appears. The conversion logic is here:

{code:java}
public static BlockCommand convert(BlockCommandProto blkCmd) {
    List<BlockProto> blockProtoList = blkCmd.getBlocksList();
    Block[] blocks = new Block[blockProtoList.size()];
    ...
    StorageType[][] targetStorageTypes = new StorageType[targetList.size()][];
    List<StorageTypesProto> targetStorageTypesList = blkCmd.getTargetStorageTypesList();
    if (targetStorageTypesList.isEmpty()) { // missing storage types
      for(int i = 0; i < targetStorageTypes.length; i++) {
        targetStorageTypes[i] = new StorageType[targets[i].length];
        Arrays.fill(targetStorageTypes[i], StorageType.DEFAULT);
      }
    } else {
      for(int i = 0; i < targetStorageTypes.length; i++) {
        List<StorageTypeProto> p = targetStorageTypesList.get(i).getStorageTypesList();
        targetStorageTypes[i] = p.toArray(new StorageType[p.size()]);   <=== should do the try-catch
      }
    }
{code}

An easy fix is to add a try-catch here and fall back to the default storage type when the conversion fails.
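To make the failure mode concrete, here is a minimal stand-alone reproduction. {{ArrayStoreDemo}} and its nested enums are stand-ins for the generated protobuf enum and the HDFS enum (two distinct Java types), not HDFS code. The generic signature {{<T> T[] toArray(T[] a)}} places no bound relating {{T}} to the list's element type, so the quoted line compiles, but every store into the target array fails the runtime check:

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ArrayStoreDemo {
  // Stand-ins for the two distinct Java types involved; illustration only.
  enum StorageTypeProto { DISK, SSD }
  enum StorageType { DISK, SSD }

  public static void main(String[] args) {
    List<StorageTypeProto> p = new ArrayList<>();
    p.add(StorageTypeProto.DISK);
    try {
      // Compiles: T is inferred as StorageType, with no relation to the
      // element type. Fails at runtime: a StorageTypeProto value cannot
      // be stored into a StorageType[] slot.
      StorageType[] types = p.toArray(new StorageType[p.size()]);
      System.out.println(Arrays.toString(types));
    } catch (ArrayStoreException e) {
      // The same exception seen in the heartbeat stack trace above.
      System.out.println("ArrayStoreException, as in the DN log");
    }
  }
}
{code}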
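And a minimal sketch of the proposed fallback, using the same stand-in enums; {{convertWithFallback}} is a hypothetical helper to keep the example self-contained, not the shape of an actual patch:

{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StorageTypeFallbackSketch {
  enum StorageTypeProto { DISK, SSD, ARCHIVE }
  enum StorageType {
    DISK, SSD;
    // Mirrors the real enum's DEFAULT constant used by the isEmpty() branch.
    static final StorageType DEFAULT = DISK;
  }

  // Hypothetical helper: wrap the failing conversion in a try-catch and
  // fall back to the default storage type, as proposed above.
  static StorageType[] convertWithFallback(List<StorageTypeProto> p, int targetCount) {
    try {
      return p.toArray(new StorageType[p.size()]);
    } catch (ArrayStoreException e) {
      // Behave as if the storage types were missing, matching the
      // existing isEmpty() branch of PBHelper.convert.
      StorageType[] types = new StorageType[targetCount];
      Arrays.fill(types, StorageType.DEFAULT);
      return types;
    }
  }

  public static void main(String[] args) {
    List<StorageTypeProto> p = new ArrayList<>();
    p.add(StorageTypeProto.ARCHIVE);
    // Prints [DISK]: the unparseable storage type degrades to the default.
    System.out.println(Arrays.toString(convertWithFallback(p, 1)));
  }
}
{code}

Note that in this literal form the {{toArray}} call fails for any non-empty list, so the fallback effectively restores the pre-HDFS-6584 behavior on old DNs; an element-by-element conversion could instead preserve the storage types the old enum does recognize.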