Could it be that you are missing the HBASE_HOME variable?
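If it is not the env var: in client mode the driver runs on your local machine, and --files only ships hbase-site.xml into the executors' working directories, so the driver side still falls back to localhost:2181 (which matches your log). A sketch that often helps, assuming /home/siddesh is the directory holding hbase-site.xml (adjust the path for your setup), is to also put that directory on the driver classpath:

```shell
# Sketch, not tested on your cluster: /home/siddesh is assumed to be
# the directory that contains hbase-site.xml.
spark-submit \
  --master yarn \
  --deploy-mode client \
  --files /home/siddesh/hbase-site.xml \
  --driver-class-path /home/siddesh \
  --class com.orzota.rs.json.HbaseConnector \
  --packages com.hortonworks:shc:1.0.0-2.0-s_2.11 \
  --repositories http://repo.hortonworks.com/content/groups/public/ \
  target/scala-2.11/test-0.1-SNAPSHOT.jar
```

The idea is that HBaseConfiguration.create() looks for hbase-site.xml on the JVM classpath; on some distributions spark-env.sh already wires HBASE_CONF_DIR into the classpath, which is why the $SPARK_HOME/conf copy worked.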

Jorge Machado

> On 23 Feb 2018, at 04:55, Dharmin Siddesh J <siddeshjdhar...@gmail.com> wrote:
> 
> I am trying to write a Spark program that reads data from HBase and stores it 
> in a DataFrame.
> 
> I am able to run it perfectly with hbase-site.xml in the $SPARK_HOME/conf 
> folder, but I am facing a few issues here.
> 
> Issue 1
> 
> The first issue is passing the hbase-site.xml location with the --files 
> parameter when submitting in client mode (it works in cluster mode).
> 
> 
> 
> When I removed hbase-site.xml from $SPARK_HOME/conf and tried to execute in 
> client mode by passing it with the --files parameter over YARN, I keep getting 
> the exception below (which I think means it is not picking up the ZooKeeper 
> configuration from hbase-site.xml):
> 
> spark-submit \
>   --master yarn \
>   --deploy-mode client \
>   --files /home/siddesh/hbase-site.xml \
>   --class com.orzota.rs.json.HbaseConnector \
>   --packages com.hortonworks:shc:1.0.0-2.0-s_2.11 \
>   --repositories http://repo.hortonworks.com/content/groups/public/ \
>   target/scala-2.11/test-0.1-SNAPSHOT.jar
> 
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
> 18/02/22 01:43:09 INFO ClientCnxn: Opening socket connection to server 
> localhost/0:0:0:0:0:0:0:1:2181. Will not attempt to authenticate using SASL 
> (unknown error)
> 18/02/22 01:43:09 WARN ClientCnxn: Session 0x0 for server null, unexpected 
> error, closing socket connection and attempting reconnect
> java.net.ConnectException: Connection refused
>         at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>         at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>         at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1125)
> 
> However, it works fine when I run it in cluster mode.
> 
> 
> 
> Issue 2
> 
> The second issue is passing the HBase configuration details through the Spark 
> session, which I can't get to work in either client or cluster mode.
> 
> 
> 
> Instead of passing the entire hbase-site.xml, I am trying to add the 
> configuration directly in the code as configuration parameters on the 
> SparkSession, e.g.:
> 
> 
> 
> val spark = SparkSession
>   .builder()
>   .appName(name)
>   .config("hbase.zookeeper.property.clientPort", "2181")
>   .config("hbase.zookeeper.quorum", "ip1,ip2,ip3")
>   .config("spark.hbase.host", "zookeeperquorum")
>   .getOrCreate()
> 
> 
> 
> val json_df = spark.read
>   .option("catalog", catalog_read)
>   .format("org.apache.spark.sql.execution.datasources.hbase")
>   .load()
> 
> This does not work in cluster mode either.
> 
> 
> 
> Can anyone help me with a solution, or an explanation of why this is 
> happening? Are there any workarounds?
> 
> 
> 
