[jira] [Commented] (SPARK-21827) Task fail due to executor exception when enable Sasl Encryption
[ https://issues.apache.org/jira/browse/SPARK-21827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16279779#comment-16279779 ]

Yishan Jiang commented on SPARK-21827:
--------------------------------------

Yes, I am using HDFS. Cores per executor: mostly the default; I also tried other numbers (2, 3, etc.) and hit the same issue.

> Task fail due to executor exception when enable Sasl Encryption
> ---------------------------------------------------------------
>
>                 Key: SPARK-21827
>                 URL: https://issues.apache.org/jira/browse/SPARK-21827
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle, Spark Core
>    Affects Versions: 1.6.1, 2.1.1, 2.2.0
>         Environment: OS: RedHat 7.1 64bit
>            Reporter: Yishan Jiang
>
> We hit this with authentication and SASL encryption enabled on many versions; the 1.6.1 configuration is appended here:
> spark.local.dir /tmp/test-161
> spark.shuffle.service.enabled true
> *spark.authenticate true*
> *spark.authenticate.enableSaslEncryption true*
> *spark.network.sasl.serverAlwaysEncrypt true*
> spark.authenticate.secret e25d4369-bec3-4266-8fc5-fb6d4fcee66f
> spark.history.ui.port 18089
> spark.shuffle.service.port 7347
> spark.master.rest.port 6076
> spark.deploy.recoveryMode NONE
> spark.ssl.enabled true
> spark.executor.extraJavaOptions -Djava.security.egd=file:/dev/./urandom
> We ran a Spark example and the task failed with these exception messages:
> 17/08/22 03:56:52 INFO BlockManager: external shuffle service port = 7347
> 17/08/22 03:56:52 INFO BlockManagerMaster: Trying to register BlockManager
> 17/08/22 03:56:52 INFO sasl: DIGEST41:Unmatched MACs
> 17/08/22 03:56:52 WARN TransportChannelHandler: Exception in connection from cws57n6.ma.platformlab.ibm.com/172.29.8.66:49394
> java.lang.IllegalArgumentException: Frame length should be positive: -5594407078713290673
>         at org.spark-project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
>         at org.apache.spark.network.util.TransportFrameDecoder.decodeNext(TransportFrameDecoder.java:135)
>         at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:82)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>         at java.lang.Thread.run(Thread.java:785)
> 17/08/22 03:56:52 ERROR TransportResponseHandler: Still have 1 requests outstanding when connection from cws57n6.ma.platformlab.ibm.com/172.29.8.66:49394 is closed
> 17/08/22 03:56:52 WARN NettyRpcEndpointRef: Error sending message [message = RegisterBlockManager(BlockManagerId(fe9d31da-f70c-40a2-9032-05a5af4ba4c5, cws58n1.ma.platformlab.ibm.com, 45852),2985295872,NettyRpcEndpointRef(null))] in 1 attempts
> java.lang.IllegalArgumentException: Frame length should be positive: -5594407078713290673
>         at org.spark-project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
>         at org.apache.spark.network.util.TransportFrameDecoder.decodeNext(TransportFrameDecoder.java:135)
>         at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:82)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>         at java.lang.Thread.run(Thread.java:785)
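The SASL-related settings from the report above can be collected into a minimal spark-defaults.conf fragment for reproduction. This is only a sketch assembled from the configuration quoted in this thread, not a recommendation; `<shared-secret>` is a placeholder, and the key point (relevant to the "Unmatched MACs" log line above) is that the same literal secret must be configured on every daemon, including the external shuffle service.

```
# Minimal SASL-over-RPC setup (Spark standalone); sketch based on the
# configuration quoted in this report. <shared-secret> is a placeholder:
# the identical value must be set on every daemon, including the external
# shuffle service, or the SASL handshake may fail as logged above.
spark.authenticate                       true
spark.authenticate.secret                <shared-secret>
spark.authenticate.enableSaslEncryption  true
spark.network.sasl.serverAlwaysEncrypt   true
spark.shuffle.service.enabled            true
spark.shuffle.service.port               7347
```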
[jira] [Commented] (SPARK-21495) DIGEST-MD5: Out of order sequencing of messages from server
[ https://issues.apache.org/jira/browse/SPARK-21495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139834#comment-16139834 ]

Yishan Jiang commented on SPARK-21495:
--------------------------------------

Hi Sean, I saw you resolved this issue as "Not a Problem". Does this mean Spark only supports a simple secret like "aaa"?

> DIGEST-MD5: Out of order sequencing of messages from server
> -----------------------------------------------------------
>
>                 Key: SPARK-21495
>                 URL: https://issues.apache.org/jira/browse/SPARK-21495
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle, Spark Core
>    Affects Versions: 1.6.1
>         Environment: OS: RedHat 7.1 64bit
>                      Spark: 1.6.1
>            Reporter: Xin Yu Pan
>
> We hit an issue when enabling authentication and SASL encryption; see the bold entries in the following parameter list.
> spark.local.dir /tmp/xpan-spark-161
> spark.eventLog.dir file:///home/xpan/spark-conf/event
> spark.eventLog.enabled true
> spark.history.fs.logDirectory file:/home/xpan/spark-conf/event
> spark.history.ui.port 18085
> spark.history.fs.cleaner.enabled true
> spark.history.fs.cleaner.interval 1d
> spark.history.fs.cleaner.maxAge 14d
> spark.dynamicAllocation.enabled false
> spark.shuffle.service.enabled false
> spark.shuffle.service.port 7448
> spark.shuffle.reduceLocality.enabled false
> spark.master.port 7087
> spark.master.rest.port 6077
> spark.executor.extraJavaOptions -Djava.security.egd=file:/dev/./urandom
> *spark.authenticate true
> spark.authenticate.secret 5828d44b-f9b9-4033-b1f5-21d1e3273ec2
> spark.authenticate.enableSaslEncryption false
> spark.network.sasl.serverAlwaysEncrypt false*
> We ran the simple SparkPi example, and there were exception messages even though the application completed.
> # cat spark-1.6.1-bin-hadoop2.6/logs/spark-xpan-org.apache.spark.deploy.ExternalShuffleService-1-cws-75.out.1
> ... ...
> 17/07/20 02:57:30 INFO spark.SecurityManager: SecurityManager: authentication enabled; ui acls disabled; users with view permissions: Set(xpan); users with modify permissions: Set(xpan)
> 17/07/20 02:57:31 INFO deploy.ExternalShuffleService: Starting shuffle service on port 7448 with useSasl = true
> 17/07/20 02:58:04 INFO shuffle.ExternalShuffleBlockResolver: Registered executor AppExecId{appId=app-20170720025800-, execId=0} with ExecutorShuffleInfo{localDirs=[/tmp/xpan-spark-161/spark-8e4885a3-c463-4dfb-a396-04e16b65fd1e/executor-be15fcd0-c946-4c83-ba25-3b20bbce5b0e/blockmgr-0fd2658a-ce15-4d56-901c-4c746161bbe0], subDirsPerLocalDir=64, shuffleManager=org.apache.spark.shuffle.sort.SortShuffleManager}
> 17/07/20 02:58:11 INFO security.sasl: DIGEST41:Unmatched MACs
> 17/07/20 02:58:11 WARN server.TransportChannelHandler: Exception in connection from /172.29.10.77:50616
> io.netty.handler.codec.DecoderException: javax.security.sasl.SaslException: DIGEST-MD5: Out of order sequencing of messages from server. Got: 125 Expected: 123
>         at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:99)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>         at java.lang.Thread.run(Thread.java:785)
> Caused by: javax.security.sasl.SaslException: DIGEST-MD5: Out of order sequencing of messages from server. Got: 125 Expected: 123
>         at com.ibm.security.sasl.digest.DigestMD5Base$DigestPrivacy.unwrap(DigestMD5Base.java:1535)
>         at com.ibm.security.sasl.digest.DigestMD5Base.unwrap(DigestMD5Base.java:231)
>         at org.apache.spark.network.sasl.SparkSaslServer.unwrap(SparkSaslServer.java:149)
>         at org.apache.spark.network.sasl.SaslEncryption$DecryptionHandler.decode(SaslEncryption.java:127)
[jira] [Updated] (SPARK-21827) Task fail due to executor exception when enable Sasl Encryption
[ https://issues.apache.org/jira/browse/SPARK-21827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yishan Jiang updated SPARK-21827:
---------------------------------
    Description: 
We hit this with authentication and SASL encryption enabled on many versions; the 1.6.1 configuration is appended here:
spark.local.dir /tmp/test-161
spark.shuffle.service.enabled true
*spark.authenticate true*
*spark.authenticate.enableSaslEncryption true*
*spark.network.sasl.serverAlwaysEncrypt true*
spark.authenticate.secret e25d4369-bec3-4266-8fc5-fb6d4fcee66f
spark.history.ui.port 18089
spark.shuffle.service.port 7347
spark.master.rest.port 6076
spark.deploy.recoveryMode NONE
spark.ssl.enabled true
spark.executor.extraJavaOptions -Djava.security.egd=file:/dev/./urandom
We ran a Spark example and the task failed with these exception messages:
17/08/22 03:56:52 INFO BlockManager: external shuffle service port = 7347
17/08/22 03:56:52 INFO BlockManagerMaster: Trying to register BlockManager
17/08/22 03:56:52 INFO sasl: DIGEST41:Unmatched MACs
17/08/22 03:56:52 WARN TransportChannelHandler: Exception in connection from cws57n6.ma.platformlab.ibm.com/172.29.8.66:49394
java.lang.IllegalArgumentException: Frame length should be positive: -5594407078713290673
        at org.spark-project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
        at org.apache.spark.network.util.TransportFrameDecoder.decodeNext(TransportFrameDecoder.java:135)
        at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:82)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:785)
17/08/22 03:56:52 ERROR TransportResponseHandler: Still have 1 requests outstanding when connection from cws57n6.ma.platformlab.ibm.com/172.29.8.66:49394 is closed
17/08/22 03:56:52 WARN NettyRpcEndpointRef: Error sending message [message = RegisterBlockManager(BlockManagerId(fe9d31da-f70c-40a2-9032-05a5af4ba4c5, cws58n1.ma.platformlab.ibm.com, 45852),2985295872,NettyRpcEndpointRef(null))] in 1 attempts
java.lang.IllegalArgumentException: Frame length should be positive: -5594407078713290673
        at org.spark-project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
        at org.apache.spark.network.util.TransportFrameDecoder.decodeNext(TransportFrameDecoder.java:135)
        at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:82)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:785)
17/08/22 03:56:55 ERROR TransportClient: Failed to send RPC 9091046580632843491 to cws57n6.ma.platformlab.ibm.com/172.29.8.66:49394: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
17/08/22 03:56:55 WARN NettyRpcEndpointRef: Error sending message [message = RegisterBlockManager(BlockManagerId(fe9d31da-f70c-40a2-9032-05a5af4ba4c5, cws58n1.ma.platformlab.ibm.com, 45852),2985295872,NettyRpcEndpointRef(null))] in 2 attempts
java.io.IOException: Failed to send RPC 9091046580632843491 to cws57n6.ma.platformlab.ibm.com/172.29.8.66:49394: java.nio.channels.ClosedChannelException
        at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:239)
[jira] [Created] (SPARK-21827) Task fail due to executor exception when enable Sasl Encryption
Yishan Jiang created SPARK-21827:
------------------------------------

             Summary: Task fail due to executor exception when enable Sasl Encryption
                 Key: SPARK-21827
                 URL: https://issues.apache.org/jira/browse/SPARK-21827
             Project: Spark
          Issue Type: Bug
          Components: Shuffle, Spark Core
    Affects Versions: 2.2.0, 2.1.1, 1.6.1
         Environment: linux x86_64
            Reporter: Yishan Jiang

We hit this with authentication and SASL encryption enabled on many versions; the 1.6.1 configuration is appended here:
spark.local.dir /tmp/test-161
spark.shuffle.service.enabled true
*spark.authenticate true*
*spark.authenticate.enableSaslEncryption true*
*spark.network.sasl.serverAlwaysEncrypt true*
spark.history.ui.port 18089
spark.shuffle.service.port 7347
spark.master.rest.port 6076
spark.deploy.recoveryMode NONE
spark.ssl.enabled true
spark.executor.extraJavaOptions -Djava.security.egd=file:/dev/./urandom
We ran a Spark example and the task failed with these exception messages:
17/08/22 03:56:52 INFO BlockManager: external shuffle service port = 7347
17/08/22 03:56:52 INFO BlockManagerMaster: Trying to register BlockManager
17/08/22 03:56:52 INFO sasl: DIGEST41:Unmatched MACs
17/08/22 03:56:52 WARN TransportChannelHandler: Exception in connection from cws57n6.ma.platformlab.ibm.com/172.29.8.66:49394
java.lang.IllegalArgumentException: Frame length should be positive: -5594407078713290673
        at org.spark-project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
        at org.apache.spark.network.util.TransportFrameDecoder.decodeNext(TransportFrameDecoder.java:135)
        at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:82)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:785)
17/08/22 03:56:52 ERROR TransportResponseHandler: Still have 1 requests outstanding when connection from cws57n6.ma.platformlab.ibm.com/172.29.8.66:49394 is closed
17/08/22 03:56:52 WARN NettyRpcEndpointRef: Error sending message [message = RegisterBlockManager(BlockManagerId(fe9d31da-f70c-40a2-9032-05a5af4ba4c5, cws58n1.ma.platformlab.ibm.com, 45852),2985295872,NettyRpcEndpointRef(null))] in 1 attempts
java.lang.IllegalArgumentException: Frame length should be positive: -5594407078713290673
        at org.spark-project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
        at org.apache.spark.network.util.TransportFrameDecoder.decodeNext(TransportFrameDecoder.java:135)
        at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:82)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
        at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
        at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:785)
17/08/22 03:56:55 ERROR TransportClient: Failed to send RPC 9091046580632843491 to cws57n6.ma.platformlab.ibm.com/172.29.8.66:49394: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
17/08/22 03:56:55 WARN NettyRpcEndpointRef: Error sending message [message = RegisterBlockManager(BlockManagerId(fe9d31da-f70c-40a2-9032-05a5af4ba4c5, cws58n1.ma.platformlab.ibm.com, 45852),2985295872,NettyRpcEndpointRef(null))] in 2 attempts
java.io.IOException: Failed to send RPC 9091046580632843491 to
[jira] [Updated] (SPARK-21827) Task fail due to executor exception when enable Sasl Encryption
[ https://issues.apache.org/jira/browse/SPARK-21827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yishan Jiang updated SPARK-21827:
---------------------------------
    Environment: OS: RedHat 7.1 64bit  (was: linux x86_64)

> Task fail due to executor exception when enable Sasl Encryption
> ---------------------------------------------------------------
>
>                 Key: SPARK-21827
>                 URL: https://issues.apache.org/jira/browse/SPARK-21827
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle, Spark Core
>    Affects Versions: 1.6.1, 2.1.1, 2.2.0
>         Environment: OS: RedHat 7.1 64bit
>            Reporter: Yishan Jiang
>
> We hit this with authentication and SASL encryption enabled on many versions; the 1.6.1 configuration is appended here:
> spark.local.dir /tmp/test-161
> spark.shuffle.service.enabled true
> *spark.authenticate true*
> *spark.authenticate.enableSaslEncryption true*
> *spark.network.sasl.serverAlwaysEncrypt true*
> spark.history.ui.port 18089
> spark.shuffle.service.port 7347
> spark.master.rest.port 6076
> spark.deploy.recoveryMode NONE
> spark.ssl.enabled true
> spark.executor.extraJavaOptions -Djava.security.egd=file:/dev/./urandom
> We ran a Spark example and the task failed with these exception messages:
> 17/08/22 03:56:52 INFO BlockManager: external shuffle service port = 7347
> 17/08/22 03:56:52 INFO BlockManagerMaster: Trying to register BlockManager
> 17/08/22 03:56:52 INFO sasl: DIGEST41:Unmatched MACs
> 17/08/22 03:56:52 WARN TransportChannelHandler: Exception in connection from cws57n6.ma.platformlab.ibm.com/172.29.8.66:49394
> java.lang.IllegalArgumentException: Frame length should be positive: -5594407078713290673
>         at org.spark-project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
>         at org.apache.spark.network.util.TransportFrameDecoder.decodeNext(TransportFrameDecoder.java:135)
>         at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:82)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>         at java.lang.Thread.run(Thread.java:785)
> 17/08/22 03:56:52 ERROR TransportResponseHandler: Still have 1 requests outstanding when connection from cws57n6.ma.platformlab.ibm.com/172.29.8.66:49394 is closed
> 17/08/22 03:56:52 WARN NettyRpcEndpointRef: Error sending message [message = RegisterBlockManager(BlockManagerId(fe9d31da-f70c-40a2-9032-05a5af4ba4c5, cws58n1.ma.platformlab.ibm.com, 45852),2985295872,NettyRpcEndpointRef(null))] in 1 attempts
> java.lang.IllegalArgumentException: Frame length should be positive: -5594407078713290673
>         at org.spark-project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
>         at org.apache.spark.network.util.TransportFrameDecoder.decodeNext(TransportFrameDecoder.java:135)
>         at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:82)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>         at java.lang.Thread.run(Thread.java:785)
> 17/08/22 03:56:55 ERROR TransportClient: Failed to send RPC 9091046580632843491 to
[jira] [Commented] (SPARK-21495) DIGEST-MD5: Out of order sequencing of messages from server
[ https://issues.apache.org/jira/browse/SPARK-21495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16139813#comment-16139813 ]

Yishan Jiang commented on SPARK-21495:
--------------------------------------

I met the same issue. I tried changing spark.authenticate.secret to something as simple as "aaa" and it works well, so most likely authentication cannot handle a complicated secret. If this blocks you, change to a simple secret as a workaround.

> DIGEST-MD5: Out of order sequencing of messages from server
> -----------------------------------------------------------
>
>                 Key: SPARK-21495
>                 URL: https://issues.apache.org/jira/browse/SPARK-21495
>             Project: Spark
>          Issue Type: Bug
>          Components: Shuffle, Spark Core
>    Affects Versions: 1.6.1
>         Environment: OS: RedHat 7.1 64bit
>                      Spark: 1.6.1
>            Reporter: Xin Yu Pan
>
> We hit an issue when enabling authentication and SASL encryption; see the bold entries in the following parameter list.
> spark.local.dir /tmp/xpan-spark-161
> spark.eventLog.dir file:///home/xpan/spark-conf/event
> spark.eventLog.enabled true
> spark.history.fs.logDirectory file:/home/xpan/spark-conf/event
> spark.history.ui.port 18085
> spark.history.fs.cleaner.enabled true
> spark.history.fs.cleaner.interval 1d
> spark.history.fs.cleaner.maxAge 14d
> spark.dynamicAllocation.enabled false
> spark.shuffle.service.enabled false
> spark.shuffle.service.port 7448
> spark.shuffle.reduceLocality.enabled false
> spark.master.port 7087
> spark.master.rest.port 6077
> spark.executor.extraJavaOptions -Djava.security.egd=file:/dev/./urandom
> *spark.authenticate true
> spark.authenticate.secret 5828d44b-f9b9-4033-b1f5-21d1e3273ec2
> spark.authenticate.enableSaslEncryption false
> spark.network.sasl.serverAlwaysEncrypt false*
> We ran the simple SparkPi example, and there were exception messages even though the application completed.
> # cat spark-1.6.1-bin-hadoop2.6/logs/spark-xpan-org.apache.spark.deploy.ExternalShuffleService-1-cws-75.out.1
> ... ...
> 17/07/20 02:57:30 INFO spark.SecurityManager: SecurityManager: authentication enabled; ui acls disabled; users with view permissions: Set(xpan); users with modify permissions: Set(xpan)
> 17/07/20 02:57:31 INFO deploy.ExternalShuffleService: Starting shuffle service on port 7448 with useSasl = true
> 17/07/20 02:58:04 INFO shuffle.ExternalShuffleBlockResolver: Registered executor AppExecId{appId=app-20170720025800-, execId=0} with ExecutorShuffleInfo{localDirs=[/tmp/xpan-spark-161/spark-8e4885a3-c463-4dfb-a396-04e16b65fd1e/executor-be15fcd0-c946-4c83-ba25-3b20bbce5b0e/blockmgr-0fd2658a-ce15-4d56-901c-4c746161bbe0], subDirsPerLocalDir=64, shuffleManager=org.apache.spark.shuffle.sort.SortShuffleManager}
> 17/07/20 02:58:11 INFO security.sasl: DIGEST41:Unmatched MACs
> 17/07/20 02:58:11 WARN server.TransportChannelHandler: Exception in connection from /172.29.10.77:50616
> io.netty.handler.codec.DecoderException: javax.security.sasl.SaslException: DIGEST-MD5: Out of order sequencing of messages from server. Got: 125 Expected: 123
>         at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:99)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at org.apache.spark.network.util.TransportFrameDecoder.channelRead(TransportFrameDecoder.java:86)
>         at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:308)
>         at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:294)
>         at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:846)
>         at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:131)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>         at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>         at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>         at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>         at java.lang.Thread.run(Thread.java:785)
> Caused by: javax.security.sasl.SaslException: DIGEST-MD5: Out of order sequencing of messages from server. Got: 125 Expected: 123
>         at com.ibm.security.sasl.digest.DigestMD5Base$DigestPrivacy.unwrap(DigestMD5Base.java:1535)
>         at com.ibm.security.sasl.digest.DigestMD5Base.unwrap(DigestMD5Base.java:231)
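The workaround described in the comment above, replacing a UUID-style secret (which contains "-" separators) with a simpler one, can be sketched with the Python standard library. This is only an illustration of the commenter's observation, not a confirmed fix, and `simple_secret` is a hypothetical helper name, not part of Spark.

```python
# Sketch of the workaround suggested in the comment above: generate a secret
# restricted to letters and digits, avoiding the '-' characters present in
# the UUID-style secrets used in these reports. Hypothetical helper.
import secrets
import string

def simple_secret(length: int = 16) -> str:
    """Return a random secret drawn only from [A-Za-z0-9]."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))

token = simple_secret()
assert token.isalnum() and "-" not in token
```

The resulting value would be set as `spark.authenticate.secret` on every daemon; whether secret complexity is actually the root cause here is only the commenter's hypothesis.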