Hi all, I am running a Spark program on a secured (Kerberized) cluster; it creates a SQLContext to build a DataFrame over a Phoenix table.
When I run my program in local mode with the --master option set to local[2], it works completely fine; however, when I run the same program with the master option set to yarn-client, I get the exception below:

```
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=5, exceptions:
Fri Sep 16 12:14:10 IST 2016, RpcRetryingCaller{globalStartTime=1474008247898, pause=100, retries=5}, org.apache.hadoop.hbase.MasterNotRunningException: com.google.protobuf.ServiceException: java.io.IOException: Could not set up IO Streams to demo-qa2-nn/10.60.2.15:16000
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:147)
    at org.apache.hadoop.hbase.client.HBaseAdmin.executeCallable(HBaseAdmin.java:4083)
    at org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor(HBaseAdmin.java:528)
    at org.apache.hadoop.hbase.client.HBaseAdmin.getTableDescriptor(HBaseAdmin.java:550)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:810)
    ... 50 more
Caused by: org.apache.hadoop.hbase.MasterNotRunningException: com.google.protobuf.ServiceException: java.io.IOException: Could not set up IO Streams to demo-qa2-nn/10.60.2.15:16000
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1540)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.makeStub(ConnectionManager.java:1560)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveMasterService(ConnectionManager.java:1711)
    at org.apache.hadoop.hbase.client.MasterCallable.prepare(MasterCallable.java:38)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:124)
    ... 54 more
Caused by: com.google.protobuf.ServiceException: java.io.IOException: Could not set up IO Streams to demo-qa2-nn/10.60.2.15:16000
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:223)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:287)
    at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$BlockingStub.isMasterRunning(MasterProtos.java:58152)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$MasterServiceStubMaker.isMasterRunning(ConnectionManager.java:1571)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStubNoRetries(ConnectionManager.java:1509)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation$StubMaker.makeStub(ConnectionManager.java:1531)
    ... 58 more
Caused by: java.io.IOException: Could not set up IO Streams to demo-qa2-nn/10.60.2.15:16000
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:779)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.writeRequest(RpcClientImpl.java:887)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.tracedWriteRequest(RpcClientImpl.java:856)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1200)
    at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:213)
    ... 63 more
Caused by: java.lang.RuntimeException: SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'.
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$1.run(RpcClientImpl.java:679)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.handleSaslConnectionFailure(RpcClientImpl.java:637)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:745)
    ... 67 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
    at org.apache.hadoop.hbase.security.HBaseSaslRpcClient.saslConnect(HBaseSaslRpcClient.java:179)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupSaslConnection(RpcClientImpl.java:611)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.access$600(RpcClientImpl.java:156)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:737)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection$2.run(RpcClientImpl.java:734)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
    at org.apache.hadoop.hbase.ipc.RpcClientImpl$Connection.setupIOstreams(RpcClientImpl.java:734)
    ... 67 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
    at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
    at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
    at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
    ... 76 more
```
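The last two `Caused by` entries say the process opening the HBase connection has no Kerberos TGT ("Failed to find any Kerberos tgt"), even though `kinit` evidently works for local[2]. For reference, this is the kind of explicit keytab login I could do before creating the SparkContext; a minimal sketch, where the principal and keytab path are hypothetical placeholders, not values from my cluster:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.UserGroupInformation

// Log in from a keytab before any HBase/Phoenix connection is made.
// Principal and keytab path below are hypothetical placeholders.
val hadoopConf = new Configuration()
hadoopConf.set("hadoop.security.authentication", "kerberos")
UserGroupInformation.setConfiguration(hadoopConf)
UserGroupInformation.loginUserFromKeytab(
  "dq-user@EXAMPLE.COM",
  "/etc/security/keytabs/dq-user.keytab")
```

My understanding is that such a login only covers the JVM it runs in; in yarn-client mode the executors run in YARN containers that do not share my local ticket cache, which may be why local[2] works while yarn-client fails.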
PFB the program and the command I am using:

```scala
val sparkConf = new SparkConf().setAppName(appName)
  .set("spark.kryo.registrationRequired", "true") // always use Kryo
CustomKryoRegistrator.register(sparkConf)
val sc = new SparkContext(sparkConf)
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
sqlContext.setConf("spark.sql.parquet.binaryAsString", "true")

val df = sqlContext.read.format("org.apache.phoenix.spark")
  .option("table", table_name)
  .option("zkUrl", "demo-qa2-dn03,demo-qa2-dn01,demo-qa2-dn02")
  .load()
df.show()
```

Command (`--jars` takes a comma-separated list, `--driver-class-path` a colon-separated one):

```bash
spark-submit \
  --jars $(echo ./lib/*.jar | tr ' ' ','),$(echo ./conf/*.* | tr ' ' ','),/usr/hdp/2.4.2.0-258/hbase/lib/hbase-client-1.1.2.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/hbase/lib/hbase-common-1.1.2.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/hbase/lib/hbase-server-1.1.2.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/hbase/lib/hbase-hadoop-compat-1.1.2.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/hbase/lib/hbase-protocol-1.1.2.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/phoenix/phoenix-4.4.0.2.4.2.0-258-thin-client.jar,/usr/hdp/2.4.2.0-258/phoenix/lib/phoenix-core-4.4.0.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/phoenix/lib/phoenix-spark-4.4.0.2.4.2.0-258.jar,/usr/hdp/2.4.2.0-258/hbase/lib/phoenix-4.4.0.2.4.2.0-258-client.jar \
  --driver-class-path $(echo ./lib/*.jar | tr ' ' ':'):$(echo ./conf/*.* | tr ' ' ':'):/usr/hdp/2.4.2.0-258/hbase/lib/hbase-client-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/hbase/lib/hbase-common-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/hbase/lib/hbase-protocol-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/hbase/lib/hbase-server-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/hbase/lib/hbase-hadoop-compat-1.1.2.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/phoenix/lib/phoenix-spark-4.4.0.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/phoenix/lib/phoenix-core-4.4.0.2.4.2.0-258.jar:/usr/hdp/2.4.2.0-258/phoenix/phoenix-4.4.0.2.4.2.0-258-thin-client.jar:/usr/hdp/2.4.2.0-258/hbase/lib/phoenix-4.4.0.2.4.2.0-258-client.jar \
  --master yarn-client \
  --class com.xyz.demo.dq.DataQualityApplicationHandler \
  tr-dq-16.7.0.0.0.jar org ss1 Phoenix tr-dq-job.properties QUALITY
```

I added hbase-site.xml to the Spark conf directory on all nodes and restarted the Spark service, but it didn't work; hbase-site.xml is also already present on my classpath. My Phoenix version is 4.4 and my Spark version is 1.6. I also followed the workaround given in [PHOENIX-2817][1] and tried upgrading Phoenix to 4.8, but that didn't work either.

[1]: https://issues.apache.org/jira/browse/PHOENIX-2817
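For completeness, if I revisit the 4.8 route: my understanding is that Phoenix can carry the principal and keytab directly in the connection URL (quorum:port:znode:principal:keytab), and that the phoenix-spark `zkUrl` option passes this through. A sketch with hypothetical values (the port, znode, principal, and keytab path below are placeholders for my cluster's actual settings):

```scala
// zkUrl can embed port, root znode, principal, and keytab;
// everything after the quorum here is a hypothetical placeholder.
val df = sqlContext.read.format("org.apache.phoenix.spark")
  .option("table", table_name)
  .option("zkUrl",
    "demo-qa2-dn01,demo-qa2-dn02,demo-qa2-dn03:2181:/hbase-secure" +
    ":dq-user@EXAMPLE.COM:/etc/security/keytabs/dq-user.keytab")
  .load()
```

On the submit side, spark-submit's `--principal` and `--keytab` options (available for YARN in Spark 1.6) are another way to let Spark itself log in and renew tokens for the application.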