Re: Need help on spark Hbase
Hi Team,

Now I've changed my code to read the configuration from the hbase-site.xml file (this file is on the classpath). When I run the program using

mvn exec:java -Dexec.mainClass=com.cisco.ana.accessavailability.AccessAvailability

it works fine. But when I run it from spark-submit I get the exception below: the spark-submit command cannot find HBaseConfiguration. How do I resolve this issue?

Please find the exception below:

rajesh@rajesh-VirtualBox:~/Downloads/spark-1.0.0$ ./bin/spark-submit --master local --class com.cisco.ana.accessavailability.AccessAvailability --jars /home/rajesh/Downloads/MISC/ANA_Access/target/ANA_Access-0.0.1-SNAPSHOT.jar, /home/rajesh/hbase-0.96.1.1-hadoop2/lib/hbase-client-0.96.1.1-hadoop2.jar, /home/rajesh/hbase-0.96.1.1-hadoop2/lib/hbase-common-0.96.1.1-hadoop2.jar, /home/rajesh/hbase-0.96.1.1-hadoop2/lib/hbase-hadoop2-compat-0.96.1.1-hadoop2.jar, /home/rajesh/hbase-0.96.1.1-hadoop2/lib/hbase-it-0.96.1.1-hadoop2.jar, /home/rajesh/hbase-0.96.1.1-hadoop2/lib/hbase-protocol-0.96.1.1-hadoop2.jar, /home/rajesh/hbase-0.96.1.1-hadoop2/lib/hbase-server-0.96.1.1-hadoop2.jar, /home/rajesh/hbase-0.96.1.1-hadoop2/lib/htrace-core-2.01.jar, /home/rajesh/Downloads/spark-1.0.0/assembly/target/scala-2.10/spark-assembly-1.0.0-hadoop2.2.0.jar

Warning: Local jar /home/rajesh/hbase-0.96.1.1-hadoop2/lib/hbase-client-0.96.1.1-hadoop2.jar, does not exist, skipping.
Before
*Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration*
    at com.cisco.ana.accessavailability.AccessAvailability.main(AccessAvailability.java:80)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:622)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.HBaseConfiguration
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
    ... 8 more

Please find the code below:

public class AccessAvailability {

    public static void main(String[] args) throws Exception {
        System.out.println("Before");
        Configuration configuration = HBaseConfiguration.create();
        System.out.println("After");
        SparkConf s = new SparkConf().setMaster("local");
        JavaStreamingContext ssc = new JavaStreamingContext(master, "AccessAvailability", new Duration(4), sparkHome, "");
        JavaDStream<String> lines_2 = ssc.textFileStream(hdfsfolderpath);
    }
}

Regards,
Rajesh

On Wed, Jul 16, 2014 at 5:39 AM, Krishna Sankar <ksanka...@gmail.com> wrote:

Good catch. I thought the largest port number is 65535.

Cheers
k/

On Tue, Jul 15, 2014 at 4:33 PM, Spark DevUser <spark.devu...@gmail.com> wrote:

Are you able to launch the *hbase shell* and run some commands (list, describe, scan, etc.)? It seems *configuration.set("hbase.master", "localhost:60")* is wrong.
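The "does not exist, skipping" warning above is worth chasing mechanically. A small shell sketch (the jar paths are made up for illustration) that checks whether each comma-separated --jars entry resolves to a real file before the list is handed to spark-submit:

```shell
# Verify every --jars entry points at a real file. The warning
# "Local jar ... does not exist, skipping" means a path (often one
# carrying a stray space or trailing comma) did not resolve to a file.
dir=$(mktemp -d)
touch "$dir/a.jar"               # this jar exists
JARS="$dir/a.jar,$dir/b.jar"     # b.jar is deliberately missing

for j in $(echo "$JARS" | tr ',' ' '); do
  if [ -f "$j" ]; then
    echo "ok: ${j##*/}"
  else
    echo "missing: ${j##*/}"
  fi
done
```

Running this prints "ok: a.jar" and "missing: b.jar"; any "missing" line would reproduce exactly the skipping warning at submit time.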
Re: Need help on spark Hbase
Hi Rajesh,

I saw:

Warning: Local jar /home/rajesh/hbase-0.96.1.1-hadoop2/lib/hbase-client-0.96.1.1-hadoop2.jar, does not exist, skipping.

in your log. I believe this jar contains HBaseConfiguration. I'm not sure what went wrong in your case, but can you try --jars without spaces, i.e.

--jars A.jar,B.jar,C.jar

not

--jars A.jar, B.jar, C.jar

I'm just guessing, because when I use --jars I never have spaces in it.

HTH,

Jerry
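Jerry's no-spaces rule can be sketched in shell (the paths here are placeholders, not the real jar locations): build the --jars value by joining paths with bare commas, since spark-submit splits its arguments on whitespace and a space after a comma starts a new argument.

```shell
# Join jar paths with commas and no spaces. spark-submit splits
# arguments on whitespace, so "--jars a.jar, b.jar" passes only
# "a.jar," as the --jars value and treats b.jar as a separate argument.
JARS=$(printf '%s,' /tmp/a.jar /tmp/b.jar /tmp/c.jar)
JARS=${JARS%,}   # drop the trailing comma
echo "$JARS"     # -> /tmp/a.jar,/tmp/b.jar,/tmp/c.jar

# Then pass the joined value as one argument (command is illustrative):
# ./bin/spark-submit --master local --class com.example.Main --jars "$JARS" app.jar
```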
Re: Need help on spark Hbase
Hi Rajesh,

Can you describe your Spark cluster setup? I saw localhost:2181 for ZooKeeper.

Best Regards,

Jerry

On Tue, Jul 15, 2014 at 9:47 AM, Madabhattula Rajesh Kumar <mrajaf...@gmail.com> wrote:

Hi Team,

Could you please help me to resolve the issue?

*Issue*: I'm not able to connect to HBase from spark-submit. Below is my code. When I execute the program standalone, I'm able to connect to HBase and do the operation. When I execute it using spark-submit (./bin/spark-submit), I'm not able to connect to HBase. Am I missing anything?

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.Put;
import org.apache.log4j.Logger;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class Test {

    public static void main(String[] args) throws Exception {
        JavaStreamingContext ssc = new JavaStreamingContext("local", "Test", new Duration(4), sparkHome, "");
        JavaDStream<String> lines_2 = ssc.textFileStream(hdfsfolderpath);

        Configuration configuration = HBaseConfiguration.create();
        configuration.set("hbase.zookeeper.property.clientPort", "2181");
        configuration.set("hbase.zookeeper.quorum", "localhost");
        configuration.set("hbase.master", "localhost:60");

        HBaseAdmin hBaseAdmin = new HBaseAdmin(configuration);
        if (hBaseAdmin.tableExists(HABSE_TABLE)) {
            System.out.println(" ANA_DATA table exists ..");
        }
        System.out.println(" HELLO HELLO HELLO ");

        ssc.start();
        ssc.awaitTermination();
    }
}

Thank you for your help and support.

Regards,
Rajesh
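As a side note on the hbase.master value in the code above: the port can at least be range-checked. A shell sketch (using the host:port value from the thread) splits out the port; being within 1-65535 is necessary but not sufficient, since 60 passes the check yet, as others in the thread point out, is not a port an HBase master would normally listen on.

```shell
# Extract and range-check the port from a host:port setting.
# 1-65535 is the usable TCP port range; a value outside it can never
# work, while a value inside it may still be wrong for the service.
hp="localhost:60"
port=${hp##*:}   # strip everything up to the last colon
if [ "$port" -ge 1 ] && [ "$port" -le 65535 ]; then
  echo "port $port is in the valid TCP range"
else
  echo "port $port is impossible"
fi
```

A passing range check says nothing about whether anything is listening there; something like `nc -z localhost "$port"` would test that separately.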
Re: Need help on spark Hbase
One vector to check is the HBase libraries in the --jars, as in:

spark-submit --class <your class> --master <master url> --jars hbase-client-0.98.3-hadoop2.jar,commons-csv-1.0-SNAPSHOT.jar,hbase-common-0.98.3-hadoop2.jar,hbase-hadoop2-compat-0.98.3-hadoop2.jar,hbase-it-0.98.3-hadoop2.jar,hbase-protocol-0.98.3-hadoop2.jar,hbase-server-0.98.3-hadoop2.jar,htrace-core-2.04.jar,spark-assembly-1.0.0-hadoop2.2.0.jar badwclient.jar

This worked for us.

Cheers
k/
Re: Need help on spark Hbase
Hi Rajesh,

I have a feeling that this is not directly related to Spark, but I might be wrong. The reason is that when you do:

Configuration configuration = HBaseConfiguration.create();

by default it reads the configuration file hbase-site.xml on your classpath, and ... (I don't remember all the configuration files HBase has). I noticed that you overwrote some configuration settings in the code, but I'm not sure if you have other configurations that might have conflicted with those.

Could you try the following: remove anything that is Spark specific, leaving only the HBase-related code. Uber-jar it and run it just like any other simple Java program. If you still have connection issues, then at least you know the problem is in the configurations.

HTH,

Jerry
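Jerry's isolation test can be sketched as two commands (the first step assumes the maven-shade-plugin is configured in the pom; the jar and class names are taken from earlier in the thread): bundle the HBase client classes into the application jar, then run the class as a plain Java program, outside spark-submit.

```shell
# Build an uber jar; assumes the maven-shade-plugin is configured in
# pom.xml so the HBase client classes end up inside the application jar.
mvn package

# Run the HBase-only code as a plain Java program. If this also fails,
# the problem is in the HBase configuration, not spark-submit's classpath.
java -cp target/ANA_Access-0.0.1-SNAPSHOT.jar \
  com.cisco.ana.accessavailability.AccessAvailability
```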
Re: Need help on spark Hbase
Also, it helps if you post us logs, stacktraces, exceptions, etc.

TD