[ 
https://issues.apache.org/jira/browse/HDFS-14611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874333#comment-16874333
 ] 

Chen Liang commented on HDFS-14611:
-----------------------------------

[~xkrogen] this is a bit complicated, as explained below.

The way this could break compatibility is that, in {{Credentials#readFields}}, the 
entire input stream may contain several tokens laid out as consecutive raw 
bytes. That code is not aware of the size of each individual token; it relies on 
{{t.readFields(in);}} to read exactly the right number of bytes and do the right 
thing. The current code adds a new field to Token, which extends the bytes of 
each individual token (in {{Token#write}} and {{Token#readFields}}). So if client 
and server run different versions, one side may be sending tokens with the 
extended bytes while the other is not expecting them; the side that is not 
expecting these bytes will not read them from the input in {{Token#readFields}}, 
causing subsequent reads from the stream to become misaligned.
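
To illustrate, here is a minimal, self-contained sketch (not the actual Hadoop 
classes; the writer/reader pair and field names are purely illustrative) of a 
container that writes tokens back to back with no per-token length:

{code:java}
import java.io.*;

public class MisalignmentSketch {
  public static void main(String[] args) throws IOException {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);

    // "New" writer: two tokens back to back, each with an extra trailing field.
    writeNewToken(out, "token-1", "secret-1");
    writeNewToken(out, "token-2", "secret-2");

    // "Old" reader: unaware of the extra field, so after the first token it
    // keeps reading from the middle of the first token's leftover bytes.
    DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
    System.out.println(in.readUTF());  // "token-1" -- still fine
    System.out.println(in.readUTF());  // "secret-1" -- mistaken for the next token
  }

  // The "new" token format: the old reader only knows about the first field.
  static void writeNewToken(DataOutput out, String id, String secret)
      throws IOException {
    out.writeUTF(id);      // field both versions know about
    out.writeUTF(secret);  // new field the old reader never consumes
  }
}
{code}

The old reader does not fail loudly here; it simply mistakes the unread extra 
bytes for the start of the next token, which is exactly the misalignment 
described above.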

The root cause is basically that, again, this code is not aware of the total 
number of bytes it takes to construct one {{Token}}. The logic is, however, aware 
of the total size of the underlying identifier field, so putting the new field in 
as part of the identifier does not have this issue; this is why it can be 
resolved that way. Essentially, the new bytes are put into {{Token#identifier}}, 
as part of the identifier itself. When {{Token#readFields}} gets called, it first 
reads the length of the entire identifier, creates a new array (this is 
important) to hold exactly those bytes, and then parses the identifier from that 
array. Because this correctly retrieves the whole identifier (with the new bytes 
included), the rest of the stream stays aligned as expected. 
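
For reference, the length-prefixed pattern looks roughly like this (a simplified 
sketch using fixed-width ints; the real {{Token#write}}/{{Token#readFields}} 
differ in encoding details, but the idea is the same):

{code:java}
import java.io.*;

public class LengthPrefixedTokenSketch {
  // Write side: the identifier goes out with an explicit length prefix, so the
  // reader never needs to understand what is packed inside it.
  static void writeToken(DataOutput out, byte[] identifier, byte[] password)
      throws IOException {
    out.writeInt(identifier.length);
    out.write(identifier);           // may contain new fields; reader doesn't care
    out.writeInt(password.length);
    out.write(password);
  }

  // Read side: consume exactly "len" bytes into a fresh array, then parse the
  // identifier from that array. The shared stream stays aligned regardless of
  // what the identifier contains.
  static byte[] readIdentifier(DataInput in) throws IOException {
    int len = in.readInt();
    byte[] identifier = new byte[len];
    in.readFully(identifier);
    return identifier;
  }
}
{code}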

In this case, if an old binary receives a new token identifier, it can still 
decode all the fields, except that the new bytes are not read, since it does not 
have that logic. If a new binary receives an old token, it will get an EOF at 
the end of parsing the identifier (the EOF handling in the patch's 
{{BlockTokenIdentifier}}), but this is fine because all the preceding fields have 
already been parsed, and it does not affect the stream, since the identifier byte 
array was fully read out of the original stream beforehand. Either way, the 
identifier gets recreated as expected and the rest of the stream is not 
affected.
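
A hypothetical sketch of that tolerant parsing (field names are illustrative, 
not the actual {{BlockTokenIdentifier}} layout):

{code:java}
import java.io.*;

public class ExtensibleIdentifierSketch {
  static byte[] parseIdentifier(byte[] identifierBytes) throws IOException {
    // The identifier bytes were already pulled out of the shared stream in full,
    // so an EOF here only means "old identifier, no new field".
    DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(identifierBytes));
    long expiryDate = in.readLong();   // fields both versions understand
    String userId = in.readUTF();
    byte[] handshakeSecret = new byte[0];
    try {
      int len = in.readInt();          // new, optional trailing field
      handshakeSecret = new byte[len];
      in.readFully(handshakeSecret);
    } catch (EOFException e) {
      // Old identifier: the new field is simply absent. Safe to ignore, since
      // these bytes came from a fixed-size array, not the shared stream.
    }
    return handshakeSecret;
  }
}
{code}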

Marked as blocker as suggested.

> Move handshake secret field from Token to BlockAccessToken
> ----------------------------------------------------------
>
>                 Key: HDFS-14611
>                 URL: https://issues.apache.org/jira/browse/HDFS-14611
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>            Priority: Major
>         Attachments: HDFS-14611.001.patch
>
>
> Currently the handshake secret is included in Token, but conceptually this 
> should belong to Block Access Token only. More importantly, having this field 
> in Token could potentially break compatibility. Moreover, having this field 
> as part of Block Access Token also means we may not need to encrypt this 
> field anymore, because block access token is already encrypted.


