[ https://issues.apache.org/jira/browse/HDFS-16644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17754420#comment-17754420 ]

Zilong Zhu edited comment on HDFS-16644 at 8/15/23 6:03 AM:
------------------------------------------------------------

We've also encountered this issue. Our NN and DN run Hadoop 3.2.4, and the 
client version is 2.10.1. For the same code, if only "hadoop-client" is 
included in the pom.xml, it works fine. However, if both "hadoop-client" and 
"hadoop-hdfs" are included, the issue arises.

We believe this issue is related to class loading and protocols. It leads to 
an abnormal QOP value being generated (e.g. "D"). The key to this issue lies 
in the handling of the accessToken's BlockTokenIdentifier.

The NN (3.2.4) serializes the accessToken and sends it to the client (2.10.1). 
When the client (2.10.1) deserializes that 3.2.4 accessToken, some fields end 
up with different values than the NN wrote.

For BlockTokenIdentifier (3.2.4), 
org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier#writeLegacy:
{code:java}
void writeLegacy(DataOutput out) throws IOException {
  WritableUtils.writeVLong(out, expiryDate);
  WritableUtils.writeVInt(out, keyId);
  WritableUtils.writeString(out, userId);
  WritableUtils.writeString(out, blockPoolId);
  WritableUtils.writeVLong(out, blockId);
  WritableUtils.writeVInt(out, modes.size());
  for (AccessMode aMode : modes) {
    WritableUtils.writeEnum(out, aMode);
  }
  if (storageTypes != null) {    // <== new field, not present in 2.10.1
    WritableUtils.writeVInt(out, storageTypes.length);
    for (StorageType type : storageTypes) {
      WritableUtils.writeEnum(out, type);
    }
  }
  if (storageIds != null) {      // <== new field, not present in 2.10.1
    WritableUtils.writeVInt(out, storageIds.length);
    for (String id : storageIds) {
      WritableUtils.writeString(out, id);
    }
  }
  if (handshakeMsg != null && handshakeMsg.length > 0) {
    WritableUtils.writeVInt(out, handshakeMsg.length);
    out.write(handshakeMsg);
  }
}{code}
For BlockTokenIdentifier (2.10.1), 
org.apache.hadoop.hdfs.security.token.block.BlockTokenIdentifier#readFields:
{code:java}
public void readFields(DataInput in) throws IOException {
  this.cache = null;
  if (in instanceof DataInputStream) {
    final DataInputStream dis = (DataInputStream) in;
    // this.cache should be assigned the raw bytes from the input data for
    // upgrading compatibility. If we won't mutate fields and call getBytes()
    // for something (e.g retrieve password), we should return the raw bytes
    // instead of serializing the instance self fields to bytes, because we
    // may lose newly added fields which we can't recognize.
    this.cache = IOUtils.readFullyToByteArray(dis);
    dis.reset();
  }
  expiryDate = WritableUtils.readVLong(in);
  keyId = WritableUtils.readVInt(in);
  userId = WritableUtils.readString(in);
  blockPoolId = WritableUtils.readString(in);
  blockId = WritableUtils.readVLong(in);
  int length = WritableUtils.readVIntInRange(in, 0,
      AccessMode.class.getEnumConstants().length);
  for (int i = 0; i < length; i++) {
    modes.add(WritableUtils.readEnum(in, AccessMode.class));
  }
  try {
    // In 2.10.1 this vint is expected to be the handshakeMsg length, but in a
    // token written by 3.2.4 the storageTypes count sits at this position.
    int handshakeMsgLen = WritableUtils.readVInt(in);
    if (handshakeMsgLen != 0) {
      handshakeMsg = new byte[handshakeMsgLen];
      in.readFully(handshakeMsg);
    }
  } catch (EOFException eof) {
    // the token ends without a handshakeMsg; nothing more to read
  }
} {code}
So when the client (2.10.1) deserializes the handshakeMsg, an error occurs: it 
mistakenly reads the storageTypes data written by 3.2.4 as if it were the 
handshakeMsg.
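
To make the mismatch concrete, below is a minimal sketch (a hypothetical demo class, not Hadoop code) that writes the fields in the 3.2.4 writeLegacy order and reads them back in the 2.10.1 readFields order. It only assumes hadoop-common's WritableUtils on the classpath and uses local stand-in enums and made-up field values; the point is that the vint the 2.10.1 reader takes as the handshakeMsg length is actually storageTypes.length.
{code:java}
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import org.apache.hadoop.io.WritableUtils;

public class LegacyTokenMismatchDemo {
  // local stand-ins for BlockTokenIdentifier.AccessMode and StorageType
  enum Mode { READ, WRITE, COPY, REPLACE }
  enum SType { DISK, SSD, ARCHIVE, RAM_DISK }

  public static void main(String[] args) throws Exception {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bos);

    // --- 3.2.4 NN: writeLegacy field order (values are made up) ---
    WritableUtils.writeVLong(out, 1690000000000L);   // expiryDate
    WritableUtils.writeVInt(out, 42);                // keyId
    WritableUtils.writeString(out, "testuser");      // userId
    WritableUtils.writeString(out, "BP-1");          // blockPoolId
    WritableUtils.writeVLong(out, 1073741825L);      // blockId
    WritableUtils.writeVInt(out, 1);                 // modes.size()
    WritableUtils.writeEnum(out, Mode.WRITE);        // access mode
    WritableUtils.writeVInt(out, 1);                 // storageTypes.length (3.2.4 only)
    WritableUtils.writeEnum(out, SType.DISK);        // storage type        (3.2.4 only)
    WritableUtils.writeVInt(out, 1);                 // storageIds.length   (3.2.4 only)
    WritableUtils.writeString(out, "DS-1");          // storage id          (3.2.4 only)
    // no handshakeMsg written in this example

    // --- 2.10.1 client: readFields field order ---
    DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(bos.toByteArray()));
    WritableUtils.readVLong(in);                     // expiryDate
    WritableUtils.readVInt(in);                      // keyId
    WritableUtils.readString(in);                    // userId
    WritableUtils.readString(in);                    // blockPoolId
    WritableUtils.readVLong(in);                     // blockId
    int numModes = WritableUtils.readVInt(in);       // modes.size()
    for (int i = 0; i < numModes; i++) {
      WritableUtils.readEnum(in, Mode.class);        // access mode
    }
    // 2.10.1 knows nothing about storageTypes/storageIds, so the next vint
    // (really storageTypes.length) is read as the handshakeMsg length, and the
    // byte(s) after it (storage type data) are consumed as a bogus handshakeMsg.
    int handshakeMsgLen = WritableUtils.readVInt(in);
    byte[] handshakeMsg = new byte[handshakeMsgLen];
    in.readFully(handshakeMsg);
    System.out.println("bogus handshakeMsgLen = " + handshakeMsgLen); // prints 1
  }
}
{code}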

HDFS-13541, which added the "handshakeMsg" field, was merged into both 
branch-2.10 and branch-3.2. But HDFS-6708 and HDFS-9807, which added the 
"storageTypes" and "storageIds" fields before HDFS-13541, were merged into 
branch-3.2 only. This is where the real issue lies.
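
For reference, these are the legacy on-wire layouts implied by the two methods above (bracketed parts are optional trailing fields):
{noformat}
2.10.1 (branch-2.10): expiryDate, keyId, userId, blockPoolId, blockId,
                      modes.size(), modes...,
                      [handshakeMsg.length, handshakeMsg]        <= HDFS-13541

3.2.4  (branch-3.2):  expiryDate, keyId, userId, blockPoolId, blockId,
                      modes.size(), modes...,
                      [storageTypes.length, storageTypes...],    <= HDFS-6708/HDFS-9807
                      [storageIds.length, storageIds...],        <= HDFS-6708/HDFS-9807
                      [handshakeMsg.length, handshakeMsg]        <= HDFS-13541
{noformat}
A 2.10.1 reader that gets a 3.2.4 token therefore takes the first 3.2-only field, storageTypes.length, as the handshakeMsg length.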

I want to fix this issue. Any comments and suggestions would be appreciated.

 


> java.io.IOException Invalid token in javax.security.sasl.qop
> ------------------------------------------------------------
>
>                 Key: HDFS-16644
>                 URL: https://issues.apache.org/jira/browse/HDFS-16644
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.2.1
>            Reporter: Walter Su
>            Priority: Major
>
> deployment:
> server side: Kerberos-enabled cluster with JDK 1.8 and hdfs-server 3.2.1
> client side:
> I run the command hadoop fs -put on a test file, with a Kerberos ticket 
> initialized first, and use identical core-site.xml & hdfs-site.xml 
> configuration.
> Using client ver 3.2.1, it succeeds.
> Using client ver 2.8.5, it succeeds.
> Using client ver 2.10.1, it fails. The client-side error info is:
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.SaslDataTransferClient: 
> SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = 
> false
> 2022-06-27 01:06:15,781 ERROR 
> org.apache.hadoop.hdfs.server.datanode.DataNode: 
> DataNode{data=FSDataset{dirpath='[/mnt/disk1/hdfs, /mnt/***/hdfs, 
> /mnt/***/hdfs, /mnt/***/hdfs]'}, localName='emr-worker-***.***:9866', 
> datanodeUuid='b1c7f64a-6389-4739-bddf-***', xmitsInProgress=0}:Exception 
> transfering block BP-1187699012-10.****-***:blk_1119803380_46080919 to mirror 
> 10.*****:9866
> java.io.IOException: Invalid token in javax.security.sasl.qop: D
>         at 
> org.apache.hadoop.hdfs.protocol.datatransfer.sasl.DataTransferSaslUtil.readSaslMessage(DataTransferSaslUtil.java:220)
> Once any 2.10.1 client connects to the HDFS server, the DataNode no longer 
> accepts any client connection; even a 3.2.1 client cannot connect. For a 
> short time, all DataNodes reject client connections.
> The problem exists even if I replace the DataNode with ver 3.3.0 or replace 
> Java with JDK 11.
> The problem is fixed if I replace the DataNode with ver 3.2.0. I guess the 
> problem is related to HDFS-13541.


