Hi Hari,

I am now trying to run the same FlumeEventCount example with spark-submit 
instead of run-example. To do this, I exported JavaFlumeEventCount.java 
into a jar.

The command I used is:
./bin/spark-submit --jars lib/spark-examples-1.1.0-hadoop1.0.4.jar --master local --class org.JavaFlumeEventCount bin/flumeeventcnt2.jar localhost 2323

The output is
14/11/12 17:55:02 INFO scheduler.ReceiverTracker: Stream 0 received 1 blocks
14/11/12 17:55:02 INFO scheduler.JobScheduler: Added jobs for time 1415795102000

If I use this command
./bin/spark-submit --master local --class org.JavaFlumeEventCount bin/flumeeventcnt2.jar localhost 2323

Then I get an error
Spark assembly has been built with Hive, including Datanucleus jars on classpath
Exception in thread "main" java.lang.NoClassDefFoundError: 
        at org.JavaFlumeEventCount.main(JavaFlumeEventCount.java:22)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: 
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
        ... 8 more
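
To narrow down which class cannot be loaded, a small check along these lines 
might help. This is only a sketch: ClasspathCheck is a hypothetical helper, not 
part of Spark, and the class names to test are passed as arguments.

```java
// Hypothetical helper, not part of Spark: reports whether each class name
// given on the command line can be loaded from the current classpath.
final class ClasspathCheck {

    // Returns true if the named class can be located by this class's loader.
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised by a missing dependency
            return false;
        }
    }

    public static void main(String[] args) {
        for (String name : args) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "NOT found"));
        }
    }
}
```

Run with the same classpath as the job, it would show whether a given class is 
visible with and without the --jars option.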

My first question: why is spark-submit able to find the spark-assembly jar but 
not the spark-examples jar?

My next question: when I run the FlumeEventCount example through run-example, 
I get this output:
Received 4 flume events.

14/11/12 18:30:14 INFO scheduler.JobScheduler: Finished job streaming job 
1415797214000 ms.0 from job set of time 1415797214000 ms
14/11/12 18:30:14 INFO rdd.MappedRDD: Removing RDD 70 from persistence list

But if I run the same program through spark-submit, I get this output:
14/11/12 17:55:02 INFO scheduler.ReceiverTracker: Stream 0 received 1 blocks
14/11/12 17:55:02 INFO scheduler.JobScheduler: Added jobs for time 1415795102000

I need a clarification here. The print statement in the program is written as 
"Received n flume events.", so why do I only see "Stream 0 received n blocks" 
and never that line?
And what is the difference of running the program through spark-submit and 

Awaiting your kind reply.

Jeniba Johnson

