[ https://issues.apache.org/jira/browse/HDFS-14611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16874333#comment-16874333 ]
Chen Liang commented on HDFS-14611:
-----------------------------------

[~xkrogen] this is a bit complicated, as explained below.

The way this could break compatibility is that, in {{Credentials#readFields}}, the entire input stream may contain several tokens laid out as consecutive raw bytes. That code is not aware of the size of individual tokens; it relies on {{t.readFields(in);}} to read exactly the right number of bytes and do the right thing. The current code adds a new field to Token, which extends the serialized bytes of each individual token (in {{Token#write}} and {{Token#readFields}}). So if the client and server run different versions, one side may send a token with the extended bytes while the other side is not expecting them; the side not expecting those bytes will not read them in {{Token#readFields}}, causing every subsequent read from the stream to be misaligned. The root cause, again, is that this code does not know the total number of bytes it takes to construct one {{Token}}.

The logic is, however, aware of the total size of the underlying identifier field, so putting the new field inside the identifier does not have this issue; this is why the patch resolves it. Essentially, the new bytes become part of {{Token#identifier}} itself. When {{Token#readFields}} is called, it first reads the length of the entire identifier, creates a new byte array (this is important) to hold those bytes, and then parses the identifier from that array. Because this step correctly consumes the whole identifier (new bytes included), the rest of the stream stays aligned as expected. In this case, if an old binary receives a new token identifier, it can still decode all the fields; the new bytes are simply never parsed, since the old binary lacks that logic.
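The framing argument above can be sketched with a minimal, self-contained example. Note these are illustrative stand-ins, not the actual Hadoop {{Token}}/{{Credentials}} classes: because the identifier is written length-prefixed, an "old" reader consumes the whole identifier (extra embedded bytes included) without ever parsing them, and the stream stays aligned for the next token.

```java
import java.io.*;

public class TokenFramingDemo {
    // Simplified stand-in for Token#write: length-prefixed identifier, then a kind string.
    static void writeToken(DataOutput out, byte[] identifier, String kind) throws IOException {
        out.writeInt(identifier.length); // the frame: a reader knows exactly how many bytes to consume
        out.write(identifier);
        out.writeUTF(kind);
    }

    // Simplified stand-in for Token#readFields in an "old" binary that knows
    // nothing about any new field embedded inside the identifier bytes.
    static String readToken(DataInput in) throws IOException {
        int len = in.readInt();
        byte[] identifier = new byte[len]; // fresh array, independent of the main stream
        in.readFully(identifier);          // consumes the whole identifier, new bytes included
        return in.readUTF();               // stream is still aligned for the next token
    }

    public static void main(String[] args) throws IOException {
        // A "new" writer embeds an extra secret inside the identifier bytes.
        ByteArrayOutputStream idBytes = new ByteArrayOutputStream();
        DataOutputStream idOut = new DataOutputStream(idBytes);
        idOut.writeUTF("block-pool-1");     // field an old reader understands
        idOut.writeUTF("handshake-secret"); // new field: an old reader never parses it

        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(stream);
        writeToken(out, idBytes.toByteArray(), "HDFS_BLOCK_TOKEN");
        writeToken(out, "plain".getBytes("UTF-8"), "HDFS_DELEGATION_TOKEN");

        // The old reader walks both tokens without misalignment, because the
        // identifier length tells it exactly how many bytes to skip over.
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream.toByteArray()));
        System.out.println(readToken(in)); // HDFS_BLOCK_TOKEN
        System.out.println(readToken(in)); // HDFS_DELEGATION_TOKEN
    }
}
```

If the new bytes were instead appended *after* the identifier's length-prefixed frame, {{readToken}} in the old binary would leave them unconsumed and the second token's read would start mid-field, which is the misalignment described above.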
If a new binary receives an old token, it will get an EOF at the end of parsing the identifier (the EOF handling in the patch's {{BlockTokenIdentifier}}), but this is fine: all the preceding fields are already parsed, and the main stream is unaffected because the identifier byte array was copied out of it. Either way, the identifier is reconstructed as expected and the rest of the stream is not affected.

Marked as blocker as suggested.

> Move handshake secret field from Token to BlockAccessToken
> ----------------------------------------------------------
>
>                 Key: HDFS-14611
>                 URL: https://issues.apache.org/jira/browse/HDFS-14611
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs
>            Reporter: Chen Liang
>            Assignee: Chen Liang
>            Priority: Major
>         Attachments: HDFS-14611.001.patch
>
>
> Currently the handshake secret is included in Token, but conceptually this
> should belong to Block Access Token only. More importantly, having this field
> in Token could potentially break compatibility. Moreover, having this field
> as part of Block Access Token also means we may not need to encrypt this
> field anymore, because block access token is already encrypted.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)