Hi Hari,

I am now trying to run the same FlumeEventCount example with spark-submit instead of run-example. The step I followed is to export JavaFlumeEventCount.java into a jar.
The command used is:

    ./bin/spark-submit --jars lib/spark-examples-1.1.0-hadoop1.0.4.jar --master local --class org.JavaFlumeEventCount bin/flumeeventcnt2.jar localhost 2323

and the output is:

    14/11/12 17:55:02 INFO scheduler.ReceiverTracker: Stream 0 received 1 blocks
    14/11/12 17:55:02 INFO scheduler.JobScheduler: Added jobs for time 1415795102000

If I instead use this command:

    ./bin/spark-submit --master local --class org.JavaFlumeEventCount bin/flumeeventcnt2.jar localhost 2323

then I get an error:

    Spark assembly has been built with Hive, including Datanucleus jars on classpath
    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/examples/streaming/StreamingExamples
        at org.JavaFlumeEventCount.main(JavaFlumeEventCount.java:22)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.lang.ClassNotFoundException: org.apache.spark.examples.streaming.StreamingExamples
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        ... 8 more

My first question: spark-submit is evidently able to find spark-assembly.jar, so why can it not find spark-examples.jar?

My next doubt: when I run the FlumeEventCount example through run-example, I get output such as:

    Received 4 flume events.
    14/11/12 18:30:14 INFO scheduler.JobScheduler: Finished job streaming job 1415797214000 ms.0 from job set of time 1415797214000 ms
    14/11/12 18:30:14 INFO rdd.MappedRDD: Removing RDD 70 from persistence list

But if I run the same program through spark-submit, I get output such as:

    14/11/12 17:55:02 INFO scheduler.ReceiverTracker: Stream 0 received 1 blocks
    14/11/12 17:55:02 INFO scheduler.JobScheduler: Added jobs for time 1415795102000

So I need a clarification: in the program, the print statement is written as "Received n flume events.", so how is it that I see "Stream 0 received n blocks" instead? And what is the difference between running the program through spark-submit and run-example?

Awaiting your kind reply.

Regards,
Jeniba Johnson
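P.S. To dig into the first question myself, I wrote a small standalone check (my own hypothetical helper class, not part of the Spark examples). It reports whether a class is visible on whatever classpath the JVM was started with, which is essentially what fails inside spark-submit when I omit --jars:

```java
// ClasspathCheck.java -- a hypothetical standalone helper, not part of Spark.
// Run it with and without spark-examples-*.jar on -cp to see the difference.
public class ClasspathCheck {
    // Returns "found" if the named class can be loaded, "not found" otherwise.
    static String check(String className) {
        try {
            Class.forName(className);
            return "found";
        } catch (ClassNotFoundException e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        // The class whose absence causes the NoClassDefFoundError above.
        String target = "org.apache.spark.examples.streaming.StreamingExamples";
        System.out.println(target + ": " + check(target));
    }
}
```

On a plain JVM without the examples jar on -cp, this reports "not found", which matches the ClassNotFoundException in the stack trace above.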