[jira] [Updated] (HDFS-11026) Convert BlockTokenIdentifier to use Protobuf

2017-02-13 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-11026:
-
Description: {{BlockTokenIdentifier}} currently uses a 
{{DataInput}}/{{DataOutput}} (basically a {{byte[]}}) and manual serialization 
to get data into and out of the encrypted buffer (in {{BlockKeyProto}}). Other 
TokenIdentifiers (e.g. {{ContainerTokenIdentifier}}, {{AMRMTokenIdentifier}}) 
use Protobuf. The {{BlockTokenIdentifier}} should use Protobuf as well so it 
can be expanded more easily and will be consistent with the rest of the system. 
 (was: {{BlockTokenIdentifier}} currently uses a {{DataInput}}/{{DataOutput}} 
(basically a {{byte[]}}) and manual serialization to get data into and out of 
the encrypted buffer (in {{BlockKeyProto}}). Other TokenIdentifiers (e.g. 
{{ContainerTokenIdentifier}}, {{AMRMTokenIdentifier}}) use Protobuf. The 
{{BlockTokenIdentifier}} should use Protobuf as well so it can be expanded more 
easily and will be consistent with the rest of the system.

NB: Release of this will require a version update since 2.8.x won't be able to 
decipher {{BlockKeyProto.keyBytes}} from 2.8.y.)

> Convert BlockTokenIdentifier to use Protobuf
> 
>
> Key: HDFS-11026
> URL: https://issues.apache.org/jira/browse/HDFS-11026
> Project: Hadoop HDFS
> Issue Type: Task
> Components: hdfs, hdfs-client
> Affects Versions: 2.9.0, 3.0.0-alpha1
> Reporter: Ewan Higgs
> Assignee: Ewan Higgs
> Fix For: 3.0.0-alpha3
>
> Attachments: blocktokenidentifier-protobuf.patch, 
> HDFS-11026.002.patch, HDFS-11026.003.patch, HDFS-11026.004.patch, 
> HDFS-11026.005.patch, HDFS-11026.006.patch, HDFS-11026.007.patch
>
>
> {{BlockTokenIdentifier}} currently uses a {{DataInput}}/{{DataOutput}} 
> (basically a {{byte[]}}) and manual serialization to get data into and out of 
> the encrypted buffer (in {{BlockKeyProto}}). Other TokenIdentifiers (e.g. 
> {{ContainerTokenIdentifier}}, {{AMRMTokenIdentifier}}) use Protobuf. The 
> {{BlockTokenIdentifier}} should use Protobuf as well so it can be expanded 
> more easily and will be consistent with the rest of the system.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11026) Convert BlockTokenIdentifier to use Protobuf

2017-02-13 Thread Chris Douglas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated HDFS-11026:
-
  Resolution: Fixed
Hadoop Flags: Reviewed
Release Note: Changed the serialized format of BlockTokenIdentifier to 
protocol buffers. Includes logic to decode both the old Writable format and the 
new PB format to support existing clients. Client implementations in other 
languages will require similar functionality.
      Status: Resolved  (was: Patch Available)

+1 Thanks for the detail, Ewan. I committed this.

I didn't mark this as an incompatible change: it isn't a public API, and the 
new code still decodes the old format. Libraries that implement the HDFS client 
protocol will need to be updated, though, so I added that to the release note.




[jira] [Updated] (HDFS-11026) Convert BlockTokenIdentifier to use Protobuf

2017-02-13 Thread Ewan Higgs (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-11026:
--
Attachment: HDFS-11026.007.patch

Attaching the aforementioned patch.




[jira] [Updated] (HDFS-11026) Convert BlockTokenIdentifier to use Protobuf

2017-02-10 Thread Ewan Higgs (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-11026:
--
Attachment: HDFS-11026.006.patch

Attached a new patch which adds the documentation requested by [~chris.douglas] 
and adds two more tests for empty {{BlockTokenIdentifier}}s.




[jira] [Updated] (HDFS-11026) Convert BlockTokenIdentifier to use Protobuf

2017-02-01 Thread Ewan Higgs (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-11026:
--
Attachment: HDFS-11026.005.patch

Jenkins revealed an issue in the new tests I added: when attempting to parse a 
protobuf as though it's a legacy block token, OpenJDK will raise an IOException 
while Oracle JDK throws a RuntimeException.

I've updated the patch to catch both in the appropriate test.
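
As a rough illustration of that test pattern (not the code from the patch): {{parseLegacy}} below is a hypothetical stand-in for the legacy decoder, and the two inputs are made up to trigger one failure of each kind. It does not reproduce the JDK difference itself; it only shows a check that accepts either exception type.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; none of these names come from the patch.
class LegacyParseCheck {

    // Stand-in for a hand-written legacy reader: a 4-byte length, then that many bytes.
    static String parseLegacy(byte[] raw) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw));
        int len = in.readInt();
        byte[] buf = new byte[len];  // negative length: NegativeArraySizeException (a RuntimeException)
        in.readFully(buf);           // truncated input: EOFException (an IOException)
        return new String(buf, StandardCharsets.UTF_8);
    }

    // Returns true if the bytes are rejected, whichever exception family is raised.
    static boolean rejects(byte[] raw) {
        try {
            parseLegacy(raw);
            return false;
        } catch (IOException | RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(rejects(new byte[] {0, 0, 0, 5, 'a'}));  // truncated payload
        System.out.println(rejects(new byte[] {-1, -1, -1, -1}));   // negative length
        System.out.println(rejects(new byte[] {0, 0, 0, 1, 'a'}));  // well-formed input
    }
}
```

Catching both families in one multi-catch keeps the assertion independent of which JDK runs the test.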




[jira] [Updated] (HDFS-11026) Convert BlockTokenIdentifier to use Protobuf

2017-01-25 Thread Andrew Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Wang updated HDFS-11026:
---
Fix Version/s: (was: 3.0.0-alpha2)
   3.0.0-alpha3




[jira] [Updated] (HDFS-11026) Convert BlockTokenIdentifier to use Protobuf

2016-11-10 Thread Ewan Higgs (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-11026:
--
Attachment: HDFS-11026.003.patch

Attaching HDFS-11026.003.patch which adds the default value for 
{{dfs.block.access.token.protobuf.enable}} to {{hdfs-default.xml}}. This fixes 
one of the test failures. The other looked spurious.




[jira] [Updated] (HDFS-11026) Convert BlockTokenIdentifier to use Protobuf

2016-11-02 Thread Ewan Higgs (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-11026:
--
Attachment: HDFS-11026.002.patch

Attached is HDFS-11026.002.patch. This provides a configuration option, 
{{dfs.block.access.token.protobuf.enable}}, which selects how the 
{{BlockTokenIdentifier}} is written: with WritableUtils ("Legacy") when set to 
false, or with Protobuf when set to true.

h4. How to use it
Admins can roll out the new datanodes or namenodes in any order. When all the 
servers are in place, stop the namenode. Set the configuration option to 
{{true}}. Restart the namenode.

h4. How to roll back
Stop the namenode. Set the configuration option back to {{false}}. Restart the namenode.

h4. How it works
When writing to the buffer, we use the value provided by the configuration to 
determine whether legacy or protobuf is written. This sets the {{useProto}} 
flag in the {{BlockTokenIdentifier}}, which is then used to determine how we 
write the buffer.

When reading, we peek at the first byte and check if it's less than or equal 
to 0, as [~chris.douglas] suggested \[1\]; a non-positive byte indicates the 
legacy format. If we discover that we are in a protobuf world, then we set the 
{{BlockTokenIdentifier.useProto}} flag to true. This is required because we 
often have a pattern:

{code}
BlockTokenIdentifier id = new BlockTokenIdentifier();
id.readFields(in);
...
{code}

Down the line, we may ask the {{BlockTokenSecretManager}} to create a password. 
This uses a {{BlockTokenIdentifier}} which has been built up as above and then 
calls {{getBytes}}. If we didn't acknowledge that we are in a protobuf world, 
then it would create a password using legacy WritableUtils.
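
The read-then-rewrite behaviour described above can be sketched roughly as follows. This is a toy model, not the patch: {{ToyBlockTokenId}}, its one-byte markers and its single {{expiryDate}} field are invented for illustration. The only parts taken from the description are the peek at the first byte, the {{<= 0}} check, and the fact that the detected format is remembered for later {{getBytes}} calls.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Toy stand-in for BlockTokenIdentifier; both byte layouts here are invented.
class ToyBlockTokenId {
    private boolean useProto;
    private long expiryDate;

    ToyBlockTokenId() { this(false, 0L); }

    ToyBlockTokenId(boolean useProto, long expiryDate) {
        this.useProto = useProto;
        this.expiryDate = expiryDate;
    }

    boolean usesProto() { return useProto; }

    // Write in whichever format the flag selects (toy markers: 0x08 vs. -1).
    void write(DataOutputStream out) throws IOException {
        out.writeByte(useProto ? 0x08 : -1);
        out.writeLong(expiryDate);
    }

    // Peek at the first byte without consuming it, then remember the format.
    void readFields(DataInputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IOException("Could not peek first byte.");
        }
        in.mark(1);
        byte first = in.readByte();
        in.reset();
        useProto = first > 0;       // first <= 0 is treated as legacy
        in.readByte();              // consume the toy marker
        expiryDate = in.readLong();
    }

    // Re-serializes using the remembered format, as a password computation would.
    byte[] getBytes() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            write(new DataOutputStream(buf));
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static ToyBlockTokenId decode(byte[] raw) {
        try {
            ToyBlockTokenId id = new ToyBlockTokenId();  // starts out as legacy
            id.readFields(new DataInputStream(new ByteArrayInputStream(raw)));
            return id;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] sent = new ToyBlockTokenId(true, 42L).getBytes();
        ToyBlockTokenId received = decode(sent);
        System.out.println(received.usesProto());                      // true
        System.out.println(Arrays.equals(sent, received.getBytes()));  // true
    }
}
```

If {{readFields}} did not update the flag, {{received.getBytes()}} would come out in the legacy layout and any password computed over those bytes would no longer match the one computed by the sender.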

h4. Tests
Also in the patch, I've updated the {{TestBlockToken}} tests to run in both 
Legacy and protobuf modes. 

I also manually tested it locally using the following configurations. I 
couldn't get a 3.0 and a 2.8 datanode/namenode to talk to each other, so I'm 
not sure if I'm doing something wrong or if this is a showstopper for rolling 
upgrades (if so, I'll enter a ticket):

||NameNode version||DataNode version||Config option||Result||
|3.0.0-alpha1-SNAPSHOT|3.0.0-alpha1-SNAPSHOT|false|Works using legacy|
|3.0.0-alpha1-SNAPSHOT|3.0.0-alpha1-SNAPSHOT|true|Works using protobuf|
|2.8.0-SNAPSHOT|3.0.0-alpha1-SNAPSHOT|false/true|Fails with IncorrectVersionException|
|3.0.0-alpha1-SNAPSHOT|2.8.0-SNAPSHOT|false/true|Fails with IncorrectVersionException|

\[1\] [~chris.douglas] actually suggested only checking that the value is 
negative but it's possible to create empty {{BlockTokenIdentifier}}s and write 
them. This results in a 0 in the first byte. There is an explicit test in 
{{TestBlockToken.testWritable}} so I decided to keep the behaviour.
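
To make the footnote concrete, here is a small sketch of why the check is {{<= 0}} rather than {{< 0}}. It rests on two assumptions that are not stated in this thread: that the legacy layout begins with the expiry date written as a Hadoop variable-length long, and that the re-implementation of that encoding below matches {{WritableUtils.writeVLong}}. {{VLongFirstByte}} is a made-up name.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Illustration only: a stand-alone copy of the variable-length long layout,
// assumed to match Hadoop's WritableUtils.writeVLong.
class VLongFirstByte {

    static void writeVLong(DataOutputStream out, long i) throws IOException {
        if (i >= -112 && i <= 127) {
            out.writeByte((int) i);          // small values fit in a single byte
            return;
        }
        int len = -112;
        if (i < 0) {
            i ^= -1L;                        // one's complement for negative values
            len = -120;
        }
        long tmp = i;
        while (tmp != 0) {                   // count how many payload bytes are needed
            tmp >>= 8;
            len--;
        }
        out.writeByte(len);                  // negative marker byte encodes sign and length
        len = (len < -120) ? -(len + 120) : -(len + 112);
        for (int idx = len; idx != 0; idx--) {
            int shift = (idx - 1) * 8;
            out.writeByte((int) ((i >> shift) & 0xFF));
        }
    }

    static byte[] encode(long value) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            writeVLong(new DataOutputStream(buf), value);
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static byte firstByte(long value) {
        return encode(value)[0];
    }

    public static void main(String[] args) {
        System.out.println(firstByte(0L));              // empty identifier: expiry date 0
        System.out.println(firstByte(1486944000000L));  // a millisecond timestamp
        System.out.println(0x08);                       // protobuf tag for field 1, varint
    }
}
```

Under those assumptions, a realistic expiry timestamp starts with a negative marker byte, an empty identifier starts with 0, and a protobuf message whose first field has a small field number starts with a positive tag byte, which is what lets a single {{<= 0}} comparison separate the two formats.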




[jira] [Updated] (HDFS-11026) Convert BlockTokenIdentifier to use Protobuf

2016-10-19 Thread Ewan Higgs (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-11026:
--
Attachment: blocktokenidentifier-protobuf.patch

Attaching a patch that converts the {{BlockTokenIdentifier}} to use Protobuf. 

NB: This uses {{optional}} values for all the items being written to the token 
secret since that's the behaviour of the previous version that used 
{{WritableUtils.writeString}}. I think it would be better to force the Strings 
to be non-null when writing but I chose to do it this way so it's like-for-like 
with the previous version.
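
A short sketch of the like-for-like point: the legacy string encoding can represent {{null}} as something distinct from the empty string, which is what an unset {{optional}} field preserves on the Protobuf side. The code below is a stand-alone approximation, written on the assumption that {{WritableUtils.writeString}} writes a length of -1 for {{null}}; {{NullableString}} is a hypothetical name and this is not the Hadoop source.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Illustration only: approximates the legacy nullable-string layout.
class NullableString {

    static void writeString(DataOutputStream out, String s) throws IOException {
        if (s == null) {
            out.writeInt(-1);                // assumed sentinel for "no value"
            return;
        }
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    static String readString(DataInputStream in) throws IOException {
        int len = in.readInt();
        if (len == -1) {
            return null;                     // null survives the round trip
        }
        byte[] bytes = new byte[len];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    static String roundTrip(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            writeString(new DataOutputStream(buf), s);
            return readString(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("user"));
        System.out.println(roundTrip(""));
        System.out.println(roundTrip(null));
    }
}
```

Forcing the strings to be non-null on write would remove the need for the sentinel, at the cost of no longer matching what the previous version accepted.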




[jira] [Updated] (HDFS-11026) Convert BlockTokenIdentifier to use Protobuf

2016-10-19 Thread Ewan Higgs (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewan Higgs updated HDFS-11026:
--
Fix Version/s: 3.0.0-alpha2
   Status: Patch Available  (was: Open)
