Re: How do I validate Data Encryption on Block data transfer?

2020-02-11 Thread Antonio A. Rendina

That parameter is related to data encryption on RPC when a client connects.

I never checked, but if you use it without Kerberos you should have
encryption without authentication. So, for example, if you do:


hdfs dfs -touch /tmp/myfile

the file will be owned by the user running the hdfs command, and that
user must exist on all machines.


The only difference compared to a non-secure configuration is that the
data between client and server is encrypted using symmetric encryption.


I'm not a security expert, but I think (*I'm theorizing*) that in this
case there is a security risk in the handshake during the initial
connection, so you should run commands this way only from a trusted host.
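
For completeness, a minimal core-site.xml sketch of what I am describing
could look like this. The property names and values are real Hadoop
settings, but whether hadoop.rpc.protection=privacy really takes effect
without Kerberos is exactly the open question in this thread, so treat it
as something to test, not a verified recipe:

  <property>
    <name>hadoop.security.authentication</name>
    <!-- "simple" is the default: no Kerberos, identity comes from the client OS user -->
    <value>simple</value>
  </property>
  <property>
    <name>hadoop.rpc.protection</name>
    <!-- "privacy" asks SASL for authentication + integrity + encryption on RPC -->
    <value>privacy</value>
  </property>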


--

Antonio

On 10/02/20 23:52, Daniel Howard wrote:
On Wed, Feb 5, 2020 at 11:29 PM Antonio A. Rendina
<arendin...@gmail.com> wrote:


I never configured the access token, so I don't know how it works,
but I think that you should also set:

hadoop.rpc.protection=privacy

Question: can one use *hadoop.rpc.protection=privacy* /without/ Kerberos?

In theory, SASL can be used independently of Kerberos, but I haven't 
found an example for doing this with Hadoop, yet.


As for my original question ... after I enabled encrypted data 
transfer for block data transfer I did some filesystem benchmarks and 
there was plenty of performance impact to let me know that /something/ 
was afoot. :)


-danny

--
http://dannyman.toldme.com


Re: How do I validate Data Encryption on Block data transfer?

2020-02-10 Thread Daniel Howard
On Wed, Feb 5, 2020 at 11:29 PM Antonio A. Rendina wrote:

> I never configured the access token, so I don't know how it works, but I
> think that you should also set:
>
> hadoop.rpc.protection=privacy
>
Question: can one use *hadoop.rpc.protection=privacy* *without* Kerberos?

In theory, SASL can be used independently of Kerberos, but I haven't found
an example for doing this with Hadoop, yet.
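
One small sanity check I can do in the meantime (assuming the standard
getconf tool) is to ask a client which value it actually resolves for
that key:

  hdfs getconf -confKey hadoop.rpc.protection

That only confirms what the configuration says, of course, not whether
the RPC traffic is actually wrapped.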

As for my original question ... after I enabled encrypted data transfer for
block data transfer I did some filesystem benchmarks and there was plenty
of performance impact to let me know that *something* was afoot. :)
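
(For anyone who wants to reproduce that kind of check, the stock TestDFSIO
job that ships in the MapReduce test jar is one option. The jar path and
sizes below are illustrative, adjust them for your install:

  # write and then read a handful of files, once with encryption off and once with it on
  hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar \
    TestDFSIO -write -nrFiles 8 -fileSize 256MB
  hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar \
    TestDFSIO -read -nrFiles 8 -fileSize 256MB

Comparing the reported throughput between the two runs makes the
encryption overhead visible.)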

-danny

-- 
http://dannyman.toldme.com


Re: How do I validate Data Encryption on Block data transfer?

2020-02-05 Thread Antonio A. Rendina
I never configured the access token, so I don't know how it works, but I 
think that you should also set:


hadoop.rpc.protection=privacy

Regards

On 05/02/20 19:01, Wei-Chiu Chuang wrote:
I don't know the answer to the question off the top of my head.
Tracing through the source code, it looks like the data transfer
encryption does not really depend on Kerberos.


That said,
(1) the Hadoop data transfer encryption relies on the data encryption
key distributed by the NameNode. If a client can't validate the
authenticity of the NameNode, it may not make much sense to encrypt.
(2) (With my Cloudera hat on) If you use CDH, CM warns you that
encryption is not effective if Kerberos is not on, which means
this configuration is unsupported by Cloudera.


You can use a packet sniffer to validate the encryption.

On Tue, Feb 4, 2020 at 3:44 PM Daniel Howard wrote:


Hello,

My scenario is running Hadoop in an environment without multiple
users in a secure datacenter. Nevertheless, we prefer to have
encrypted data transfers for activity between nodes. We have
determined that we do not need to set up Kerberos, so I am working
through getting encryption going on block data transfer and web
services.

I appear to have DFS encryption enabled thanks to the following
settings in *hdfs-site.xml*:

  <property>
    <name>dfs.encrypt.data.transfer</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.block.access.token.enable</name>
    <value>true</value>
  </property>

Indeed, I was getting handshake errors on the datanodes with
dfs.encrypt.data.transfer enabled until I also set
dfs.block.access.token.enable.

Filesystem operations work great now, but I still see plenty of this:

2020-02-04 15:25:59,492 INFO sasl.SaslDataTransferClient: SASL
encryption trust check: localHostTrusted = false,
remoteHostTrusted = false
2020-02-04 15:25:59,862 INFO sasl.SaslDataTransferClient: SASL
encryption trust check: localHostTrusted = false,
remoteHostTrusted = false
2020-02-04 15:26:00,054 INFO sasl.SaslDataTransferClient: SASL
encryption trust check: localHostTrusted = false,
remoteHostTrusted = false

I reckon that SASL is a Kerberos feature that I shouldn't ever
expect to see reported as true. Does that sound right?

Is there a way to verify that DFS is encrypting data between
nodes? (I could get a sniffer out...)

Thanks,
-danny

-- 
http://dannyman.toldme.com




Re: How do I validate Data Encryption on Block data transfer?

2020-02-05 Thread Wei-Chiu Chuang
I don't know the answer to the question off the top of my head. Tracing
through the source code, it looks like the data transfer encryption does not
really depend on Kerberos.

That said,
(1) the Hadoop data transfer encryption relies on the data encryption key
distributed by the NameNode. If a client can't validate the authenticity of
the NameNode, it may not make much sense to encrypt.
(2) (With my Cloudera hat on) If you use CDH, CM warns you that
encryption is not effective if Kerberos is not on, which means this
configuration is unsupported by Cloudera.

You can use a packet sniffer to validate the encryption.
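
For example, a rough check with tcpdump on a DataNode could look like the
following. I'm assuming the default data transfer port 9866 (Hadoop 3.x;
older releases use 50010), and the marker string is just a made-up value
you write into a test file during the capture:

  # in one terminal: capture some data-transfer traffic and dump payloads as ASCII
  tcpdump -i any -c 200 -A 'tcp port 9866' > /tmp/dn-capture.txt

  # in another terminal: write a file containing a recognizable plaintext marker
  echo "MY_PLAINTEXT_MARKER" | hdfs dfs -put - /tmp/encryption-check

  # with encryption working, the marker should not appear in the capture
  grep -c MY_PLAINTEXT_MARKER /tmp/dn-capture.txt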

On Tue, Feb 4, 2020 at 3:44 PM Daniel Howard  wrote:

> Hello,
>
> My scenario is running Hadoop in an environment without multiple users in
> a secure datacenter. Nevertheless, we prefer to have encrypted data
> transfers for activity between nodes. We have determined that we do not
> need to set up Kerberos, so I am working through getting encryption going
> on block data transfer and web services.
>
> I appear to have DFS encryption enabled thanks to the following settings
> in *hdfs-site.xml*:
> <property>
>   <name>dfs.encrypt.data.transfer</name>
>   <value>true</value>
> </property>
> <property>
>   <name>dfs.block.access.token.enable</name>
>   <value>true</value>
> </property>
>
> Indeed, I was getting handshake errors on the datanodes with
> dfs.encrypt.data.transfer enabled until I also set
> dfs.block.access.token.enable.
>
> Filesystem operations work great now, but I still see plenty of this:
>
> 2020-02-04 15:25:59,492 INFO sasl.SaslDataTransferClient: SASL encryption
> trust check: localHostTrusted = false, remoteHostTrusted = false
> 2020-02-04 15:25:59,862 INFO sasl.SaslDataTransferClient: SASL encryption
> trust check: localHostTrusted = false, remoteHostTrusted = false
> 2020-02-04 15:26:00,054 INFO sasl.SaslDataTransferClient: SASL encryption
> trust check: localHostTrusted = false, remoteHostTrusted = false
>
> I reckon that SASL is a Kerberos feature that I shouldn't ever expect to
> see reported as true. Does that sound right?
>
> Is there a way to verify that DFS is encrypting data between nodes? (I
> could get a sniffer out...)
>
> Thanks,
> -danny
>
> --
> http://dannyman.toldme.com
>