https://issues.apache.org/jira/browse/SPARK-21827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838421#comment-16838421

Sébastien BARNOUD commented on SPARK-21827:
-------------------------------------------

Hi,

We have been using SASL with Kafka for a while now, with huge volumes. I just had a 
look at the Kafka implementation:

[https://github.com/apache/kafka/blob/6ca899e56d451eef04e81b0f4d88bdb10f3cf4b3/clients/src/main/java/org/apache/kafka/common/network/Selector.java]

The KafkaChannel (including the SaslClient) is managed by this class, which is 
clearly documented as NOT thread safe. That is probably why we have never 
noticed this issue with Kafka and SASL.

 

> Task fail due to executor exception when enable Sasl Encryption
> ---------------------------------------------------------------
>
>                 Key: SPARK-21827
>                 URL: https://issues.apache.org/jira/browse/SPARK-21827
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle, Spark Core
>    Affects Versions: 1.6.1, 2.1.1, 2.2.0
>         Environment: OS: RedHat 7.1 64bit
>            Reporter: Yishan Jiang
>            Priority: Major
>
> We hit this with authentication and SASL encryption enabled on many versions. 
> For example, on 1.6.1 we used a configuration like this:
> spark.local.dir /tmp/test-161
> spark.shuffle.service.enabled true
> *spark.authenticate true*
> *spark.authenticate.enableSaslEncryption true*
> *spark.network.sasl.serverAlwaysEncrypt true*
> spark.authenticate.secret e25d4369-bec3-4266-8fc5-fb6d4fcee66f
> spark.history.ui.port 18089
> spark.shuffle.service.port 7347
> spark.master.rest.port 6076
> spark.deploy.recoveryMode NONE
> spark.ssl.enabled true
> spark.executor.extraJavaOptions -Djava.security.egd=file:/dev/./urandom
> We ran a Spark example and the task failed with these exception messages:
> 17/08/22 03:56:52 INFO BlockManager: external shuffle service port = 7347
> 17/08/22 03:56:52 INFO BlockManagerMaster: Trying to register BlockManager
> 17/08/22 03:56:52 INFO sasl: DIGEST41:Unmatched MACs
> 17/08/22 03:56:52 WARN TransportChannelHandler: Exception in connection from 
> cws57n6.ma.platformlab.ibm.com/172.29.8.66:49394
> java.lang.IllegalArgumentException: Frame length should be positive: 
> -5594407078713290673       
>         at 
> org.spark-project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
>         at 
> org.apache.spark.network.util.TransportFrameDecoder.decodeNext(TransportFrameDecoder.java:135)
>         at 
> org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:82)
>         at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>         at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>         at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>         at java.lang.Thread.run(Thread.java:785)
> 17/08/22 03:56:52 ERROR TransportResponseHandler: Still have 1 requests 
> outstanding when connection from 
> cws57n6.ma.platformlab.ibm.com/172.29.8.66:49394 is closed
> 17/08/22 03:56:52 WARN NettyRpcEndpointRef: Error sending message [message = 
> RegisterBlockManager(BlockManagerId(fe9d31da-f70c-40a2-9032-05a5af4ba4c5, 
> cws58n1.ma.platformlab.ibm.com, 45852),2985295872,NettyRpcEn
> dpointRef(null))] in 1 attempts
> java.lang.IllegalArgumentException: Frame length should be positive: 
> -5594407078713290673
>         at 
> org.spark-project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
>         at 
> org.apache.spark.network.util.TransportFrameDecoder.decodeNext(TransportFrameDecoder.java:135)
>         at 
> org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:82)
>         at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at 
> io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>         at 
> io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>         at 
> io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at 
> io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>         at java.lang.Thread.run(Thread.java:785)
> 17/08/22 03:56:55 ERROR TransportClient: Failed to send RPC 
> 9091046580632843491 to cws57n6.ma.platformlab.ibm.com/172.29.8.66:49394: 
> java.nio.channels.ClosedChannelException
> java.nio.channels.ClosedChannelException
> 17/08/22 03:56:55 WARN NettyRpcEndpointRef: Error sending message [message = 
> RegisterBlockManager(BlockManagerId(fe9d31da-f70c-40a2-9032-05a5af4ba4c5, 
> cws58n1.ma.platformlab.ibm.com, 45852),2985295872,NettyRpcEndpointRef(null))] 
> in 2 attempts
> java.io.IOException: Failed to send RPC 9091046580632843491 to 
> cws57n6.ma.platformlab.ibm.com/172.29.8.66:49394: 
> java.nio.channels.ClosedChannelException
>         at 
> org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:239)
>         at 
> org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:226)
>         at 
> io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
>         at 
> io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:567)
>         at 
> io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
>         at 
> io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:801)
>         at 
> io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:699)
>         at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1122)
>         at 
> io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:633)
>         at 
> io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:32)
>         at 
> io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:908)
>         at 
> io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:960)
>         at 
> io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:893)
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
>         at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>         at java.lang.Thread.run(Thread.java:785)
> Caused by: java.nio.channels.ClosedChannelException
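The negative frame length in the quoted trace is what length-prefixed framing produces when it misreads corrupted bytes as a header. Below is a minimal sketch of that check, not Spark's actual `TransportFrameDecoder`: it assumes only what the trace shows, namely an 8-byte length prefix validated with a "Frame length should be positive" precondition. Interleaved output from a shared, non-thread-safe SaslClient is one way eight bytes of ciphertext can end up where a length is expected.

```java
import java.nio.ByteBuffer;

// Minimal sketch of length-prefixed frame decoding: read an 8-byte
// big-endian length, reject anything non-positive. Garbage bytes
// interpreted as a length typically fail this check with a huge
// negative value, as in the stack trace above.
public class FrameLengthCheck {
    static long readFrameLength(ByteBuffer buf) {
        long frameLength = buf.getLong();
        if (frameLength <= 0) {
            throw new IllegalArgumentException(
                "Frame length should be positive: " + frameLength);
        }
        return frameLength;
    }

    public static void main(String[] args) {
        // A healthy frame header announcing a 128-byte body.
        ByteBuffer ok = ByteBuffer.allocate(8).putLong(128L);
        ok.flip();
        System.out.println(readFrameLength(ok));

        // Eight bytes of corrupted stream misread as a header.
        ByteBuffer corrupted = ByteBuffer.allocate(8)
            .putLong(-5594407078713290673L);
        corrupted.flip();
        try {
            readFrameLength(corrupted);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The decoder cannot recover once the stream is desynchronized, which is why the connection is subsequently closed and the pending RPCs fail with `ClosedChannelException`.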



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
