Hi all,

We are trying to enable RPC encryption between the driver and executors. We are currently running Spark 2.4 on Kubernetes.
According to the Apache Spark Security documentation (https://spark.apache.org/docs/latest/security.html) and our understanding of it, Spark supports AES-based encryption for RPC connections. There is also support for SASL-based encryption, although it should be considered deprecated. Setting spark.network.crypto.enabled to true enables AES-based RPC encryption.

However, when we enable AES-based encryption between the driver and executors, we observe very sporadic behaviour in the driver-executor communication in the logs. Following are the options and the values we used to enable encryption (see the P.S. below for a rough sketch of the spark-submit invocation we use):

spark.authenticate true
spark.authenticate.secret <some-value>
spark.network.crypto.enabled true
spark.network.crypto.keyLength 256
spark.network.crypto.saslFallback false

A snippet of the executor log is provided below:

Exception in thread "main" 19/02/26 07:27:08 ERROR RpcOutboxMessage: Ask timeout before connecting successfully
Caused by: java.util.concurrent.TimeoutException: Cannot receive any reply from sts-spark-thrift-server-1551165767426-driver-svc.default.svc:7078 in 120 seconds

However, there is no error message, or any message from the executor at all, in the driver log for the same timestamp. We also tried increasing spark.network.timeout, but with no luck.

The issue appears sporadically; we have noted the following:

1) Sometimes, enabling AES encryption works completely fine.
2) Sometimes, enabling AES encryption works fine for around 10 consecutive spark-submits, but the next spark-submit hangs with the above error in the executor log.
3) At other times, enabling AES encryption does not work at all: Spark keeps spawning executors (more than 50), all of which fail with the above error.

Even setting spark.network.crypto.saslFallback to true did not help.

Things work fine when we enable only SASL encryption, that is, when we set only the following parameters:

spark.authenticate true
spark.authenticate.secret <some-value>

I have attached the log file containing the detailed error messages. Please let us know if any configuration is missing, or whether anyone has faced the same issue. Any leads would be highly appreciated!

Kind Regards,
Breeta Sinha
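P.S. For reference, below is roughly how we pass these settings via spark-submit on Kubernetes. The master URL, container image, service account, main class, and application jar are placeholders; only the security-related --conf flags reflect the settings described above.

# sketch only: everything in <...> is a placeholder for our actual values
bin/spark-submit \
  --master k8s://https://<k8s-apiserver-host>:<port> \
  --deploy-mode cluster \
  --conf spark.kubernetes.container.image=<our-spark-image> \
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=<service-account> \
  --conf spark.authenticate=true \
  --conf spark.authenticate.secret=<some-value> \
  --conf spark.network.crypto.enabled=true \
  --conf spark.network.crypto.keyLength=256 \
  --conf spark.network.crypto.saslFallback=false \
  --class <main-class> \
  <application-jar>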
Attachment: rpc_timeout_error.log