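Simple Java Program to Connect to Phoenix DB:

import com.google.common.collect.ImmutableMap;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SQLContext;

public class PhoenixToDataFrame {
    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf();
        sparkConf.setAppName("Using-spark-phoenix-df");
        sparkConf.setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);
        SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc);

        // Read the Phoenix table STOCK through Spark's generic JDBC data source.
        Dataset<Row> fromPhx = sqlContext
            .read()
            .format("jdbc")
            .options(
                ImmutableMap.of(
                    "driver", "org.apache.phoenix.jdbc.PhoenixDriver",
                    "url", "jdbc:phoenix:ZOOKEEPER_QUORUM_URL:/hbase",
                    "dbtable", "STOCK"))
            .load();

        // Count the rows to force the read.
        System.out.println(fromPhx.toJavaRDD().count());
    }
}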

spark-submit --class PhoenixToDataFrame \
  --master yarn-client --deploy-mode client --executor-memory 1g \
  --files /etc/hbase/conf/hbase-site.xml \
  --name hbasespark \
  --conf "spark.executor.extraClassPath=/usr/lib/phoenix/phoenix-spark-4.13.0-HBase-1.4.jar:/usr/lib/phoenix/phoenix-core-4.13.0-HBase-1.4.jar:/usr/lib/phoenix-4.13.0-HBase-1.4-client.jar:/usr/lib/phoenix-client.jar:/usr/lib/phoenix-server.jar" \
  hbaseandspark-0.0.1-SNAPSHOT-jar-with-dependencies.jar

ERROR:

18/05/06 01:25:06 INFO Utils: Successfully started service 'sparkDriver' on port 34053.
18/05/06 01:25:06 INFO SparkEnv: Registering MapOutputTracker
18/05/06 01:25:06 INFO SparkEnv: Registering BlockManagerMaster
18/05/06 01:25:06 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
18/05/06 01:25:06 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
18/05/06 01:25:06 INFO DiskBlockManager: Created local directory at /mnt/tmp/blockmgr-08e5b083-e2bb-4389-8b30-29a1be08ee7e
18/05/06 01:25:06 INFO MemoryStore: MemoryStore started with capacity 414.4 MB
18/05/06 01:25:07 INFO SparkEnv: Registering OutputCommitCoordinator
18/05/06 01:25:07 INFO Utils: Successfully started service 'SparkUI' on port 4040.
18/05/06 01:25:07 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.16.129.152:4040
18/05/06 01:25:07 INFO SparkContext: Added JAR file:/home/ec2-user/test/target/hbaseandspark-0.0.1-SNAPSHOT-jar-with-dependencies.jar at spark://10.16.129.152:34053/jars/hbaseandspark-0.0.1-SNAPSHOT-jar-with-dependencies.jar with timestamp 1525569907456
18/05/06 01:25:07 INFO Executor: Starting executor ID driver on host localhost
18/05/06 01:25:07 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 34659.
18/05/06 01:25:07 INFO NettyBlockTransferService: Server created on 10.16.129.152:34659
18/05/06 01:25:07 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
18/05/06 01:25:07 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.16.129.152, 34659, None)
18/05/06 01:25:07 INFO BlockManagerMasterEndpoint: Registering block manager 10.16.129.152:34659 with 414.4 MB RAM, BlockManagerId(driver, 10.16.129.152, 34659, None)
18/05/06 01:25:07 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.16.129.152, 34659, None)
18/05/06 01:25:07 INFO BlockManager: external shuffle service port = 7337
18/05/06 01:25:07 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.16.129.152, 34659, None)
18/05/06 01:25:09 INFO EventLoggingListener: Logging events to hdfs:///var/log/spark/apps/local-1525569907548
18/05/06 01:25:09 INFO SharedState: loading hive config file: file:/etc/spark/conf.dist/hive-site.xml
18/05/06 01:25:09 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('hdfs:///user/spark/warehouse').
18/05/06 01:25:09 INFO SharedState: Warehouse path is 'hdfs:///user/spark/warehouse'.
18/05/06 01:25:11 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
18/05/06 01:25:12 INFO ConnectionQueryServicesImpl: An instance of ConnectionQueryServices was created.
18/05/06 01:25:12 WARN RpcControllerFactory: Cannot load configured "hbase.rpc.controllerfactory.class" (org.apache.hadoop.hbase.ipc.controller.ClientRpcControllerFactory) from hbase-site.xml, falling back to use default RpcControllerFactory
18/05/06 01:25:12 INFO RecoverableZooKeeper: Process identifier=hconnection-0x50110971 connecting to ZooKeeper ensemble=ZOOKEEPER_QUORUM_URL:2181
18/05/06 01:25:13 INFO MetricsConfig: loaded properties from hadoop-metrics2.properties
18/05/06 01:25:13 INFO MetricsSystemImpl: Scheduled Metric snapshot period at 300 second(s).
18/05/06 01:25:13 INFO MetricsSystemImpl: HBase metrics system started
18/05/06 01:25:13 INFO MetricRegistries: Loaded MetricRegistries class org.apache.hadoop.hbase.metrics.impl.MetricRegistriesImpl
Exception in thread "main" java.sql.SQLException: ERROR 103 (08004): Unable to establish connection.
    at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:488)
    at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:150)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:417)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.access$400(ConnectionQueryServicesImpl.java:256)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2408)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2384)
    at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2384)
    at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
    at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
    at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
    at org.apache.spark.sql.execution.datasources.jdbc.DriverWrapper.connect(DriverWrapper.scala:45)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:61)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$createConnectionFactory$1.apply(JdbcUtils.scala:52)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:58)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:114)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:52)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:307)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:146)
    at PhoenixToDataFrame.main(PhoenixToDataFrame.java:48)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:775)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:240)
    at org.apache.hadoop.hbase.client.ConnectionManager.createConnection(ConnectionManager.java:439)
    at org.apache.hadoop.hbase.client.ConnectionManager.createConnectionInternal(ConnectionManager.java:348)
    at org.apache.hadoop.hbase.client.HConnectionManager.createConnection(HConnectionManager.java:144)
    at org.apache.phoenix.query.HConnectionFactory$HConnectionFactoryImpl.createConnection(HConnectionFactory.java:47)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:408)
    ... 27 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:238)
    ... 32 more
Caused by: java.lang.RuntimeException: Could not create interface org.apache.hadoop.hbase.zookeeper.MetricsZooKeeperSource Is the hadoop compatibility jar on the classpath?
    at org.apache.hadoop.hbase.CompatibilitySingletonFactory.getInstance(CompatibilitySingletonFactory.java:75)
    at org.apache.hadoop.hbase.zookeeper.MetricsZooKeeper.<init>(MetricsZooKeeper.java:38)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.<init>(RecoverableZooKeeper.java:130)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.connect(ZKUtil.java:143)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:181)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:155)
    at org.apache.hadoop.hbase.client.ZooKeeperKeepAliveConnection.<init>(ZooKeeperKeepAliveConnection.java:43)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.getKeepAliveZooKeeperWatcher(ConnectionManager.java:1737)
    at org.apache.hadoop.hbase.client.ZooKeeperRegistry.getClusterId(ZooKeeperRegistry.java:104)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.retrieveClusterId(ConnectionManager.java:945)
    at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.<init>(ConnectionManager.java:721)
    ... 37 more
Caused by: java.util.ServiceConfigurationError: org.apache.hadoop.hbase.zookeeper.MetricsZooKeeperSource: Provider org.apache.hadoop.hbase.zookeeper.MetricsZooKeeperSourceImpl could not be instantiated
    at java.util.ServiceLoader.fail(ServiceLoader.java:232)
    at java.util.ServiceLoader.access$100(ServiceLoader.java:185)
    at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384)
    at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404)
    at java.util.ServiceLoader$1.next(ServiceLoader.java:480)
    at org.apache.hadoop.hbase.CompatibilitySingletonFactory.getInstance(CompatibilitySingletonFactory.java:59)
    ... 47 more
Caused by: java.lang.IllegalAccessError: tried to access class org.apache.hadoop.metrics2.lib.MetricsInfoImpl from class org.apache.hadoop.metrics2.lib.DynamicMetricsRegistry
    at org.apache.hadoop.metrics2.lib.DynamicMetricsRegistry.newGauge(DynamicMetricsRegistry.java:139)
    at org.apache.hadoop.hbase.zookeeper.MetricsZooKeeperSourceImpl.<init>(MetricsZooKeeperSourceImpl.java:59)
    at org.apache.hadoop.hbase.zookeeper.MetricsZooKeeperSourceImpl.<init>(MetricsZooKeeperSourceImpl.java:51)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at java.lang.Class.newInstance(Class.java:442)
    at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380)
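
The innermost cause in the trace is an IllegalAccessError between two classes in the same package, org.apache.hadoop.metrics2.lib. One possible explanation (an assumption, not something the log itself confirms) is that the two classes are being loaded from different jars on the classpath. A minimal diagnostic sketch, using only the JDK, is to print where each class was loaded from. The class name ClassOrigin and the helper originOf are hypothetical names introduced here for illustration.

```java
import java.security.CodeSource;

public class ClassOrigin {

    /**
     * Returns the jar or directory a class was loaded from.
     * Classes from the bootstrap loader have no code source,
     * so a placeholder string is returned for them.
     */
    public static String originOf(String className) {
        try {
            // Load without initializing, so no static initializers run.
            Class<?> cls = Class.forName(
                className, false, ClassOrigin.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            return src == null
                ? "(bootstrap or unknown)"
                : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not loadable: " + e + ")";
        }
    }

    public static void main(String[] args) {
        // The two class names are taken from the IllegalAccessError above.
        String[] names = {
            "org.apache.hadoop.metrics2.lib.MetricsInfoImpl",
            "org.apache.hadoop.metrics2.lib.DynamicMetricsRegistry"
        };
        for (String name : names) {
            System.out.println(name + " -> " + originOf(name));
        }
    }
}
```

For the output to be meaningful, this would need to run with the same classpath as the failing job, for example by calling ClassOrigin.main from PhoenixToDataFrame.main before the read. If the two locations differ, the jars listed in spark.executor.extraClassPath and the jar-with-dependencies would be the first places to look.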