Re: java.lang.ClassNotFoundException for s3a committer

2020-07-21 Thread Gourav Sengupta
Hi,

I am not sure about this, but is there any requirement to use S3a at all?


Regards,
Gourav

On Tue, Jul 21, 2020 at 12:07 PM Steve Loughran 
wrote:

>
>
> On Tue, 7 Jul 2020 at 03:42, Stephen Coy 
> wrote:
>
>> Hi Steve,
>>
>> While I understand your point regarding the mixing of Hadoop jars, this
>> does not address the java.lang.ClassNotFoundException.
>>
>> Prebuilt Apache Spark 3.0 builds are only available for Hadoop 2.7 or
>> Hadoop 3.2. Not Hadoop 3.1.
>>
>
> sorry, I should have been clearer. Hadoop 3.2.x has everything you need.
>
>
>
>>
>> The only place that I have found that missing class is in the Spark
>> “hadoop-cloud” source module, and currently the only way to get the jar
>> containing it is to build it yourself. If any of the devs are listening it
>>  would be nice if this was included in the standard distribution. It has a
>> sizeable chunk of a repackaged Jetty embedded in it which I find a bit odd.
>>
>> But I am relatively new to this stuff so I could be wrong.
>>
>> I am currently running Spark 3.0 clusters with no HDFS. Spark is set up
>> like:
>>
>> hadoopConfiguration.set("spark.hadoop.fs.s3a.committer.name",
>> "directory");
>> hadoopConfiguration.set("spark.sql.sources.commitProtocolClass",
>> "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol");
>> hadoopConfiguration.set("spark.sql.parquet.output.committer.class",
>> "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter");
>> hadoopConfiguration.set("fs.s3a.connection.maximum",
>> Integer.toString(coreCount * 2));
>>
>> Querying and updating s3a data sources seems to be working ok.
>>
>> Thanks,
>>
>> Steve C
>>
>> On 29 Jun 2020, at 10:34 pm, Steve Loughran 
>> wrote:
>>
>> you are going to need hadoop-3.1 on your classpath, with hadoop-aws and
>> the same aws-sdk it was built with (1.11.something). Mixing hadoop JARs is
>> doomed. Using a different aws sdk jar is a bit risky, though more recent
>> upgrades have all been fairly low stress
>>
>> On Fri, 19 Jun 2020 at 05:39, murat migdisoglu <
>> murat.migdiso...@gmail.com> wrote:
>>
>>> Hi all
>>> I've upgraded my test cluster to spark 3 and changed my committer to
>>> directory, and I still get this error. The documentation is somewhat
>>> obscure on that.
>>> Do I need to add a third-party jar to support the new committers?
>>>
>>> java.lang.ClassNotFoundException:
>>> org.apache.spark.internal.io.cloud.PathOutputCommitProtocol
>>>
>>>
>>> On Thu, Jun 18, 2020 at 1:35 AM murat migdisoglu <
>>> murat.migdiso...@gmail.com> wrote:
>>>
>>>> Hello all,
>>>> we have a hadoop cluster (using yarn) using s3 as the filesystem with
>>>> s3guard enabled.
>>>> We are using hadoop 3.2.1 with spark 2.4.5.
>>>>
>>>> When I try to save a dataframe in parquet format, I get the following
>>>> exception:
>>>> java.lang.ClassNotFoundException:
>>>> com.hortonworks.spark.cloud.commit.PathOutputCommitProtocol
>>>>
>>>> My relevant spark configurations are as follows:
>>>>
>>>> "hadoop.mapreduce.outputcommitter.factory.scheme.s3a":"org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory",
>>>> "fs.s3a.committer.name
>>>> <https://aus01.safelinks.protection.outlook.com/?url=http%3A%2F%2Ffs.s3a.committer.name%2F&data=02%7C01%7Cscoy%40infomedia.com.au%7C25d6f7b564dd4cb53e5508d81c28e645%7C45d5407150f849caa59f9457123dc71c%7C0%7C0%7C637290309277792405&sdata=jxbuOsgSShhHZcXjrjkZmJ4DCXIXstzRFSOaOEEadRE%3D&reserved=0>":
>>>> "magic",
>>>> "fs.s3a.committer.magic.enabled": true,
>>>> "fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
>>>>
>>>> While spark streaming fails with the exception above, apache beam
>>>> succeeds writing parquet files.
>>>> What might be the problem?
>>>>
>>>> Thanks in advance
>>>>
>>>>
>>>> --
>>>> "Talkers aren’t good doers. Rest assured that we’re going there to use
>>>> our hands, not our tongues."
>>>> W. Shakespeare
>>>>
>>>
>>>
>>> --
>>> "Talkers aren’t good doers. Rest assured that we’re going there to use
>>> our hands, not our tongues."
>>> W. Shakespeare
>>>
>>
>>
>>
>


Re: java.lang.ClassNotFoundException for s3a committer

2020-07-21 Thread Steve Loughran
On Tue, 7 Jul 2020 at 03:42, Stephen Coy 
wrote:

> Hi Steve,
>
> While I understand your point regarding the mixing of Hadoop jars, this
> does not address the java.lang.ClassNotFoundException.
>
> Prebuilt Apache Spark 3.0 builds are only available for Hadoop 2.7 or
> Hadoop 3.2. Not Hadoop 3.1.
>

sorry, I should have been clearer. Hadoop 3.2.x has everything you need.
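
A quick way to confirm which Hadoop line Spark is actually picking up (a sketch to run in
spark-shell; nothing Spark-specific beyond having the session on the classpath):

import org.apache.hadoop.util.VersionInfo
// prints the Hadoop version that Spark loaded, e.g. 3.2.1 for a -Phadoop-3.2 build
println("Hadoop on the Spark classpath: " + VersionInfo.getVersion)

If that prints a 2.7.x version, the prebuilt Hadoop 2.7 package is still on the classpath
and the S3A committer classes will not be there.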



>
> The only place that I have found that missing class is in the Spark
> “hadoop-cloud” source module, and currently the only way to get the jar
> containing it is to build it yourself. If any of the devs are listening it
>  would be nice if this was included in the standard distribution. It has a
> sizeable chunk of a repackaged Jetty embedded in it which I find a bit odd.
>
> But I am relatively new to this stuff so I could be wrong.
>
> I am currently running Spark 3.0 clusters with no HDFS. Spark is set up
> like:
>
> hadoopConfiguration.set("spark.hadoop.fs.s3a.committer.name",
> "directory");
> hadoopConfiguration.set("spark.sql.sources.commitProtocolClass",
> "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol");
> hadoopConfiguration.set("spark.sql.parquet.output.committer.class",
> "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter");
> hadoopConfiguration.set("fs.s3a.connection.maximum",
> Integer.toString(coreCount * 2));
>
> Querying and updating s3a data sources seems to be working ok.
>
> Thanks,
>
> Steve C
>
> On 29 Jun 2020, at 10:34 pm, Steve Loughran 
> wrote:
>
> you are going to need hadoop-3.1 on your classpath, with hadoop-aws and
> the same aws-sdk it was built with (1.11.something). Mixing hadoop JARs is
> doomed. Using a different aws sdk jar is a bit risky, though more recent
> upgrades have all been fairly low stress
>
> On Fri, 19 Jun 2020 at 05:39, murat migdisoglu 
> wrote:
>
>> Hi all
>> I've upgraded my test cluster to spark 3 and changed my committer to
>> directory, and I still get this error. The documentation is somewhat
>> obscure on that.
>> Do I need to add a third-party jar to support the new committers?
>>
>> java.lang.ClassNotFoundException:
>> org.apache.spark.internal.io.cloud.PathOutputCommitProtocol
>>
>>
>> On Thu, Jun 18, 2020 at 1:35 AM murat migdisoglu <
>> murat.migdiso...@gmail.com> wrote:
>>
>>> Hello all,
>>> we have a hadoop cluster (using yarn) using s3 as the filesystem with
>>> s3guard enabled.
>>> We are using hadoop 3.2.1 with spark 2.4.5.
>>>
>>> When I try to save a dataframe in parquet format, I get the following
>>> exception:
>>> java.lang.ClassNotFoundException:
>>> com.hortonworks.spark.cloud.commit.PathOutputCommitProtocol
>>>
>>> My relevant spark configurations are as follows:
>>>
>>> "hadoop.mapreduce.outputcommitter.factory.scheme.s3a":"org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory",
>>> "fs.s3a.committer.name
>>> <https://aus01.safelinks.protection.outlook.com/?url=http%3A%2F%2Ffs.s3a.committer.name%2F&data=02%7C01%7Cscoy%40infomedia.com.au%7C25d6f7b564dd4cb53e5508d81c28e645%7C45d5407150f849caa59f9457123dc71c%7C0%7C0%7C637290309277792405&sdata=jxbuOsgSShhHZcXjrjkZmJ4DCXIXstzRFSOaOEEadRE%3D&reserved=0>":
>>> "magic",
>>> "fs.s3a.committer.magic.enabled": true,
>>> "fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
>>>
>>> While spark streaming fails with the exception above, apache beam
>>> succeeds writing parquet files.
>>> What might be the problem?
>>>
>>> Thanks in advance
>>>
>>>
>>> --
>>> "Talkers aren’t good doers. Rest assured that we’re going there to use
>>> our hands, not our tongues."
>>> W. Shakespeare
>>>
>>
>>
>> --
>> "Talkers aren’t good doers. Rest assured that we’re going there to use
>> our hands, not our tongues."
>> W. Shakespeare
>>
>
>
>


Re: java.lang.ClassNotFoundException for s3a committer

2020-07-06 Thread Stephen Coy
Hi Steve,

While I understand your point regarding the mixing of Hadoop jars, this does 
not address the java.lang.ClassNotFoundException.

Prebuilt Apache Spark 3.0 builds are only available for Hadoop 2.7 or Hadoop 
3.2. Not Hadoop 3.1.

The only place that I have found that missing class is in the Spark 
“hadoop-cloud” source module, and currently the only way to get the jar 
containing it is to build it yourself. If any of the devs are listening it  
would be nice if this was included in the standard distribution. It has a 
sizeable chunk of a repackaged Jetty embedded in it which I find a bit odd.

But I am relatively new to this stuff so I could be wrong.

I am currently running Spark 3.0 clusters with no HDFS. Spark is set up like:

hadoopConfiguration.set("spark.hadoop.fs.s3a.committer.name", "directory");
hadoopConfiguration.set("spark.sql.sources.commitProtocolClass", 
"org.apache.spark.internal.io.cloud.PathOutputCommitProtocol");
hadoopConfiguration.set("spark.sql.parquet.output.committer.class", 
"org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter");
hadoopConfiguration.set("fs.s3a.connection.maximum", Integer.toString(coreCount 
* 2));

Querying and updating s3a data sources seems to be working ok.
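
The same four settings can also be applied on the SparkConf when the session is built, which
is where spark.* keys normally live; a minimal sketch (assuming Spark 3.0 with the
spark-hadoop-cloud jar on the classpath, and coreCount defined as above):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("s3a-directory-committer")   // illustrative name
  .config("spark.hadoop.fs.s3a.committer.name", "directory")
  .config("spark.sql.sources.commitProtocolClass",
    "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol")
  .config("spark.sql.parquet.output.committer.class",
    "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter")
  .config("spark.hadoop.fs.s3a.connection.maximum", (coreCount * 2).toString)
  .getOrCreate()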

Thanks,

Steve C

On 29 Jun 2020, at 10:34 pm, Steve Loughran <ste...@cloudera.com.INVALID> wrote:

you are going to need hadoop-3.1 on your classpath, with hadoop-aws and the 
same aws-sdk it was built with (1.11.something). Mixing hadoop JARs is doomed. 
Using a different aws sdk jar is a bit risky, though more recent upgrades have 
all been fairly low stress

On Fri, 19 Jun 2020 at 05:39, murat migdisoglu <murat.migdiso...@gmail.com> wrote:
Hi all
I've upgraded my test cluster to spark 3 and changed my committer to directory, 
and I still get this error. The documentation is somewhat obscure on that.
Do I need to add a third-party jar to support the new committers?

java.lang.ClassNotFoundException: 
org.apache.spark.internal.io.cloud.PathOutputCommitProtocol


On Thu, Jun 18, 2020 at 1:35 AM murat migdisoglu <murat.migdiso...@gmail.com> wrote:
Hello all,
we have a hadoop cluster (using yarn) using s3 as the filesystem with s3guard 
enabled.
We are using hadoop 3.2.1 with spark 2.4.5.

When I try to save a dataframe in parquet format, I get the following exception:
java.lang.ClassNotFoundException: 
com.hortonworks.spark.cloud.commit.PathOutputCommitProtocol

My relevant spark configurations are as follows:
"hadoop.mapreduce.outputcommitter.factory.scheme.s3a":"org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory",
"fs.s3a.committer.name<https://aus01.safelinks.protection.outlook.com/?url=http%3A%2F%2Ffs.s3a.committer.name%2F&data=02%7C01%7Cscoy%40infomedia.com.au%7C25d6f7b564dd4cb53e5508d81c28e645%7C45d5407150f849caa59f9457123dc71c%7C0%7C0%7C637290309277792405&sdata=jxbuOsgSShhHZcXjrjkZmJ4DCXIXstzRFSOaOEEadRE%3D&reserved=0>":
 "magic",
"fs.s3a.committer.magic.enabled": true,
"fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",

While spark streaming fails with the exception above, apache beam succeeds 
writing parquet files.
What might be the problem?

Thanks in advance


--
"Talkers aren’t good doers. Rest assured that we’re going there to use our 
hands, not our tongues."
W. Shakespeare


--
"Talkers aren’t good doers. Rest assured that we’re going there to use our 
hands, not our tongues."
W. Shakespeare




Re: java.lang.ClassNotFoundException for s3a committer

2020-06-29 Thread Steve Loughran
you are going to need hadoop-3.1 on your classpath, with hadoop-aws and the
same aws-sdk it was built with (1.11.something). Mixing hadoop JARs is
doomed. Using a different aws sdk jar is a bit risky, though more recent
upgrades have all been fairly low stress
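
If the right jars are in place, the classes mentioned in this thread should all resolve; a
small probe (a sketch to paste into spark-shell on the cluster) makes the missing piece
obvious before running a real job:

val classesToCheck = Seq(
  "org.apache.hadoop.fs.s3a.S3AFileSystem",                        // hadoop-aws
  "org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory",           // hadoop-aws committer support
  "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol"    // spark-hadoop-cloud module
)
classesToCheck.foreach { name =>
  val status =
    try { Class.forName(name); "found" }
    catch { case _: ClassNotFoundException => "MISSING" }
  println(s"$status  $name")
}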

On Fri, 19 Jun 2020 at 05:39, murat migdisoglu 
wrote:

> Hi all
> I've upgraded my test cluster to spark 3 and changed my committer to
> directory, and I still get this error. The documentation is somewhat
> obscure on that.
> Do I need to add a third-party jar to support the new committers?
>
> java.lang.ClassNotFoundException:
> org.apache.spark.internal.io.cloud.PathOutputCommitProtocol
>
>
> On Thu, Jun 18, 2020 at 1:35 AM murat migdisoglu <
> murat.migdiso...@gmail.com> wrote:
>
>> Hello all,
>> we have a hadoop cluster (using yarn) using s3 as the filesystem with
>> s3guard enabled.
>> We are using hadoop 3.2.1 with spark 2.4.5.
>>
>> When I try to save a dataframe in parquet format, I get the following
>> exception:
>> java.lang.ClassNotFoundException:
>> com.hortonworks.spark.cloud.commit.PathOutputCommitProtocol
>>
>> My relevant spark configurations are as follows:
>>
>> "hadoop.mapreduce.outputcommitter.factory.scheme.s3a":"org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory",
>> "fs.s3a.committer.name": "magic",
>> "fs.s3a.committer.magic.enabled": true,
>> "fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
>>
>> While spark streaming fails with the exception above, apache beam
>> succeeds writing parquet files.
>> What might be the problem?
>>
>> Thanks in advance
>>
>>
>> --
>> "Talkers aren’t good doers. Rest assured that we’re going there to use
>> our hands, not our tongues."
>> W. Shakespeare
>>
>
>
> --
> "Talkers aren’t good doers. Rest assured that we’re going there to use
> our hands, not our tongues."
> W. Shakespeare
>


Re: java.lang.ClassNotFoundException for s3a committer

2020-06-18 Thread Stephen Coy
Hi Murat Migdisoglu,

Unfortunately you need the secret sauce to resolve this.

It is necessary to check out the Apache Spark source code and build it with the 
right command line options. This is what I have been using:

dev/make-distribution.sh --name my-spark --tgz -Pyarn -Phadoop-3.2  -Pyarn 
-Phadoop-cloud -Dhadoop.version=3.2.1

This will add additional jars into the build.

Copy hadoop-aws-3.2.1.jar, hadoop-openstack-3.2.1.jar and 
spark-hadoop-cloud_2.12-3.0.0.jar into the “jars” directory of your Spark 
distribution. If you are paranoid you could copy/replace all the 
hadoop-*-3.2.1.jar files but I have not found that necessary.

You will also need to upgrade the version of guava that appears in the spark 
distro because Hadoop 3.2.1 bumped this from guava-14.0.1.jar to 
guava-27.0-jre.jar. Otherwise you will get runtime ClassNotFound exceptions.
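
A quick way to see which guava actually ended up on the driver classpath after swapping the
jars (a sketch; any guava class will do for the lookup):

val guavaSource = classOf[com.google.common.collect.ImmutableList[_]]
  .getProtectionDomain.getCodeSource.getLocation
println("guava loaded from: " + guavaSource)

If that still points at guava-14.0.1.jar, the Hadoop 3.2.1 classes will fail at runtime as
described above.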

I have been using this combo for many months now with the Spark 3.0 
pre-releases and it has been working great.

Cheers,

Steve C


On 19 Jun 2020, at 10:24 am, murat migdisoglu <murat.migdiso...@gmail.com> wrote:

Hi all
I've upgraded my test cluster to spark 3 and changed my committer to directory, 
and I still get this error. The documentation is somewhat obscure on that.
Do I need to add a third-party jar to support the new committers?

java.lang.ClassNotFoundException: 
org.apache.spark.internal.io.cloud.PathOutputCommitProtocol


On Thu, Jun 18, 2020 at 1:35 AM murat migdisoglu <murat.migdiso...@gmail.com> wrote:
Hello all,
we have a hadoop cluster (using yarn) using s3 as the filesystem with s3guard 
enabled.
We are using hadoop 3.2.1 with spark 2.4.5.

When I try to save a dataframe in parquet format, I get the following exception:
java.lang.ClassNotFoundException: 
com.hortonworks.spark.cloud.commit.PathOutputCommitProtocol

My relevant spark configurations are as follows:
"hadoop.mapreduce.outputcommitter.factory.scheme.s3a":"org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory",
"fs.s3a.committer.name<https://aus01.safelinks.protection.outlook.com/?url=http%3A%2F%2Ffs.s3a.committer.name%2F&data=02%7C01%7Cscoy%40infomedia.com.au%7C0725287744754aed9c5108d813e71e6e%7C45d5407150f849caa59f9457123dc71c%7C0%7C0%7C637281230668124994&sdata=n6l70htGxJ1q%2BcWH21RWIML7eGdE26UCdY8cDsufY6o%3D&reserved=0>":
 "magic",
"fs.s3a.committer.magic.enabled": true,
"fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",

While spark streaming fails with the exception above, apache beam succeeds 
writing parquet files.
What might be the problem?

Thanks in advance


--
"Talkers aren’t good doers. Rest assured that we’re going there to use our 
hands, not our tongues."
W. Shakespeare


--
"Talkers aren’t good doers. Rest assured that we’re going there to use our 
hands, not our tongues."
W. Shakespeare



Re: java.lang.ClassNotFoundException for s3a committer

2020-06-18 Thread murat migdisoglu
Hi all
I've upgraded my test cluster to spark 3 and changed my committer to
directory, and I still get this error. The documentation is somewhat
obscure on that.
Do I need to add a third-party jar to support the new committers?

java.lang.ClassNotFoundException:
org.apache.spark.internal.io.cloud.PathOutputCommitProtocol


On Thu, Jun 18, 2020 at 1:35 AM murat migdisoglu 
wrote:

> Hello all,
> we have a hadoop cluster (using yarn) using s3 as the filesystem with s3guard
> enabled.
> We are using hadoop 3.2.1 with spark 2.4.5.
>
> When I try to save a dataframe in parquet format, I get the following
> exception:
> java.lang.ClassNotFoundException:
> com.hortonworks.spark.cloud.commit.PathOutputCommitProtocol
>
> My relevant spark configurations are as follows:
>
> "hadoop.mapreduce.outputcommitter.factory.scheme.s3a":"org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory",
> "fs.s3a.committer.name": "magic",
> "fs.s3a.committer.magic.enabled": true,
> "fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
>
> While spark streaming fails with the exception above, apache beam succeeds
> writing parquet files.
> What might be the problem?
>
> Thanks in advance
>
>
> --
> "Talkers aren’t good doers. Rest assured that we’re going there to use
> our hands, not our tongues."
> W. Shakespeare
>


-- 
"Talkers aren’t good doers. Rest assured that we’re going there to use our
hands, not our tongues."
W. Shakespeare


java.lang.ClassNotFoundException: com.hortonworks.spark.cloud.commit.PathOutputCommitProtocol

2020-06-17 Thread murat migdisoglu
Hello all,
we have a hadoop cluster (using yarn) using s3 as the filesystem with s3guard
enabled.
We are using hadoop 3.2.1 with spark 2.4.5.

When I try to save a dataframe in parquet format, I get the following
exception:
java.lang.ClassNotFoundException:
com.hortonworks.spark.cloud.commit.PathOutputCommitProtocol

My relevant spark configurations are as follows:
"hadoop.mapreduce.outputcommitter.factory.scheme.s3a":"org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory",
"fs.s3a.committer.name": "magic",
"fs.s3a.committer.magic.enabled": true,
"fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",

While spark streaming fails with the exception above, apache beam succeeds
writing parquet files.
What might be the problem?

Thanks in advance


-- 
"Talkers aren’t good doers. Rest assured that we’re going there to use our
hands, not our tongues."
W. Shakespeare


User class threw exception: java.lang.ClassNotFoundException: Failed to find data source: kafka. Please find packages at http://spark.apache.org/third-party-projects.html

2018-04-27 Thread amit kumar singh
Hi Team,

I am working on structured streaming

I have added all the libraries in build.sbt, but it is still not picking up the right
library and is failing with the error:

User class threw exception: java.lang.ClassNotFoundException: Failed to
find data source: kafka. Please find packages at
http://spark.apache.org/third-party-projects.html

I am using Jenkins to deploy this task.

thanks
amit
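
The "Failed to find data source: kafka" error usually means the spark-sql-kafka package is
not on the runtime classpath; a build.sbt sketch of the dependency it needs (the version
numbers here are assumptions, match them to your Spark version):

// build.sbt (sketch)
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-sql"            % "2.3.0" % "provided",
  // provides the "kafka" source for Structured Streaming
  "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.3.0"
)

Because spark-sql-kafka is not part of the Spark distribution, it either has to be bundled
into the application jar or passed with --packages at submit time.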


Re: Spark Streaming - java.lang.ClassNotFoundException Scala anonymous function

2017-03-01 Thread Dominik Safaric
The jars I am submitting are the following:

bin/spark-submit --class topology.SimpleProcessingTopology --master 
spark://10.0.0.8:7077 --jars /tmp/spark_streaming-1.0-SNAPSHOT.jar 
/tmp//tmp/spark_streaming-1.0-SNAPSHOT.jar /tmp/streaming.properties

I’ve even tried using the spark.executor.extraClassPath option, but 
unfortunately without success.

What do you mean by conflicting copies of Spark classes? Could you elaborate it?

> On 1 Mar 2017, at 14:51, Sean Owen  wrote:
> 
> What is the --jars you are submitting? You may have conflicting copies of 
> Spark classes that interfere.
> 
> 
> On Wed, Mar 1, 2017, 14:20 Dominik Safaric <dominiksafa...@gmail.com> wrote:
> I've been trying to submit a Spark Streaming application using spark-submit 
> to a cluster of mine consisting of a master and two worker nodes. The 
> application has been written in Scala, and built using Maven. Importantly, 
> the Maven build is configured to produce a fat JAR containing all 
> dependencies. Furthermore, the JAR has been distributed to all of the nodes. The 
> streaming job has been submitted using the following command:
> 
> bin/spark-submit --class topology.SimpleProcessingTopology --jars 
> /tmp/spark_streaming-1.0-SNAPSHOT.jar --master spark://10.0.0.8:7077 
> --verbose /tmp/spark_streaming-1.0-SNAPSHOT.jar 
> /tmp/streaming-benchmark.properties 
> where 10.0.0.8 is the IP address of the master node within the VNET. 
> 
> However, I keep getting the following exception while starting the streaming 
> application:
> 
> Driver stacktrace:
> at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435)
> at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1423)
> at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1422)
> at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
> at 
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1422)
> at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
> at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
> 
> Caused by: java.lang.ClassNotFoundException: 
> topology.SimpleProcessingTopology$$anonfun$main$1$$anonfun$apply$1
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at 
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
> I've checked the content of the JAR using jar tvf and as you can see in the 
> output below, it does contain the class in question.
> 
> 1735 Wed Mar 01 12:29:20 UTC 2017 
> topology/SimpleProcessingTopology$$anonfun$main$1.class
>702 Wed Mar 01 12:29:20 UTC 2017 topology/SimpleProcessingTopology.class
>   2415 Wed Mar 01 12:29:20 UTC 2017 
> topology/SimpleProcessingTopology$$anonfun$main$1$$anonfun$apply$1$$anonfun$apply$2.class
>   2500 Wed Mar 01 12:29:20 UTC 2017 
> topology/SimpleProcessingTopology$$anonfun$main$1$$anonfun$apply$1.class
>   7045 Wed Mar 01 12:29:20 UTC 2017 topology/SimpleProcessingTopology$.class
> This exception is caused by the anonymous function passed to the 
> foreachPartition call:
> 
> rdd.foreachPartition(partition => {
>   val outTopic = props.getString("application.simple.kafka.out.topic")
>   val producer = new KafkaProducer[Array[Byte],Array[Byte]](kafkaParams)
>   partition.foreach(record => {
> val producerRecord = new ProducerRecord[Array[Byte], 
> Array[Byte]](outTopic, record.key(), record.value())
> producer.send(producerRecord)
>   })
>   producer.close()
> })
> Unfortunately, I have not been able to find the root cause of this so far. 
> Hence, I would appreciate it if anyone could help me fix this issue.
> 



Re: Spark Streaming - java.lang.ClassNotFoundException Scala anonymous function

2017-03-01 Thread Sean Owen
What is the --jars you are submitting? You may have conflicting copies of
Spark classes that interfere.

On Wed, Mar 1, 2017, 14:20 Dominik Safaric  wrote:

> I've been trying to submit a Spark Streaming application using
> spark-submit to a cluster of mine consisting of a master and two worker
> nodes. The application has been written in Scala, and built using Maven.
> Importantly, the Maven build is configured to produce a fat JAR containing
> all dependencies. Furthermore, the JAR has been distributed to all of the
> nodes. The streaming job has been submitted using the following command:
>
> bin/spark-submit --class topology.SimpleProcessingTopology --jars 
> /tmp/spark_streaming-1.0-SNAPSHOT.jar --master spark://10.0.0.8:7077 
> --verbose /tmp/spark_streaming-1.0-SNAPSHOT.jar 
> /tmp/streaming-benchmark.properties
>
> where 10.0.0.8 is the IP address of the master node within the VNET.
>
> However, I keep getting the following exception while starting the
> streaming application:
>
> Driver stacktrace:
> at 
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435)
> at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1423)
> at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1422)
> at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
> at 
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1422)
> at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
> at 
> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
> Caused by: java.lang.ClassNotFoundException: 
> topology.SimpleProcessingTopology$$anonfun$main$1$$anonfun$apply$1
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at 
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
>
> I've checked the content of the JAR using jar tvf and as you can see in
> the output below, it does contain the class in question.
>
> 1735 Wed Mar 01 12:29:20 UTC 2017 
> topology/SimpleProcessingTopology$$anonfun$main$1.class
>702 Wed Mar 01 12:29:20 UTC 2017 topology/SimpleProcessingTopology.class
>   2415 Wed Mar 01 12:29:20 UTC 2017 
> topology/SimpleProcessingTopology$$anonfun$main$1$$anonfun$apply$1$$anonfun$apply$2.class
>   2500 Wed Mar 01 12:29:20 UTC 2017 
> topology/SimpleProcessingTopology$$anonfun$main$1$$anonfun$apply$1.class
>   7045 Wed Mar 01 12:29:20 UTC 2017 topology/SimpleProcessingTopology$.class
>
> This exception is caused by the anonymous function passed to the
> foreachPartition call:
>
> rdd.foreachPartition(partition => {
>   val outTopic = props.getString("application.simple.kafka.out.topic")
>   val producer = new KafkaProducer[Array[Byte],Array[Byte]](kafkaParams)
>   partition.foreach(record => {
> val producerRecord = new ProducerRecord[Array[Byte], 
> Array[Byte]](outTopic, record.key(), record.value())
> producer.send(producerRecord)
>   })
>   producer.close()
> })
>
> Unfortunately, I have not been able to find the root cause of this so far.
> Hence, I would appreciate it if anyone could help me fix this issue.
>
>


Spark Streaming - java.lang.ClassNotFoundException Scala anonymous function

2017-03-01 Thread Dominik Safaric
I've been trying to submit a Spark Streaming application using spark-submit to 
a cluster of mine consisting of a master and two worker nodes. The application 
has been written in Scala, and built using Maven. Importantly, the Maven build 
is configured to produce a fat JAR containing all dependencies. Furthermore, 
the JAR has been distributed to all of the nodes. The streaming job has been 
submitted using the following command:

bin/spark-submit --class topology.SimpleProcessingTopology --jars 
/tmp/spark_streaming-1.0-SNAPSHOT.jar --master spark://10.0.0.8:7077 --verbose 
/tmp/spark_streaming-1.0-SNAPSHOT.jar /tmp/streaming-benchmark.properties 
where 10.0.0.8 is the IP address of the master node within the VNET. 

However, I keep getting the following exception while starting the streaming 
application:

Driver stacktrace:
at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1435)
at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1423)
at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1422)
at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at 
org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1422)
at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)
at 
org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:802)

Caused by: java.lang.ClassNotFoundException: 
topology.SimpleProcessingTopology$$anonfun$main$1$$anonfun$apply$1
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at 
org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
I've checked the content of the JAR using jar tvf and as you can see in the 
output below, it does contain the class in question.

1735 Wed Mar 01 12:29:20 UTC 2017 
topology/SimpleProcessingTopology$$anonfun$main$1.class
   702 Wed Mar 01 12:29:20 UTC 2017 topology/SimpleProcessingTopology.class
  2415 Wed Mar 01 12:29:20 UTC 2017 
topology/SimpleProcessingTopology$$anonfun$main$1$$anonfun$apply$1$$anonfun$apply$2.class
  2500 Wed Mar 01 12:29:20 UTC 2017 
topology/SimpleProcessingTopology$$anonfun$main$1$$anonfun$apply$1.class
  7045 Wed Mar 01 12:29:20 UTC 2017 topology/SimpleProcessingTopology$.class
This exception is caused by the anonymous function passed to the 
foreachPartition call:

rdd.foreachPartition(partition => {
  val outTopic = props.getString("application.simple.kafka.out.topic")
  val producer = new KafkaProducer[Array[Byte],Array[Byte]](kafkaParams)
  partition.foreach(record => {
val producerRecord = new ProducerRecord[Array[Byte], 
Array[Byte]](outTopic, record.key(), record.value())
producer.send(producerRecord)
  })
  producer.close()
})
Unfortunately, I have not been able to find the root cause of this so far. 
Hence, I would appreciate it if anyone could help me fix this issue.
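
One way to narrow this kind of failure down (a diagnostic sketch, not a known fix) is to ask
the executors where they load the topology class from, which can reveal a stale or
conflicting copy of the application jar on a worker:

val sc = rdd.sparkContext
val report = sc.parallelize(1 to sc.defaultParallelism).map { _ =>
  val host = java.net.InetAddress.getLocalHost.getHostName
  try {
    // class name taken from the jar listing above
    val src = Class.forName("topology.SimpleProcessingTopology$")
      .getProtectionDomain.getCodeSource.getLocation
    s"$host -> $src"
  } catch {
    case _: ClassNotFoundException => s"$host -> class not found"
  }
}.collect().distinct
report.foreach(println)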



Re: No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-26 Thread Marco Mistroni
Hi Raymond
run this command and it should work, provided you have kafka set up as
well on localhost at port 2181:

spark-submit --packages
org.apache.spark:spark-streaming-kafka-0-8_2.11:2.0.1  kafka_wordcount.py
localhost:2181 test

But if you are a beginner, I suggest using the Spark examples' wordcount
instead, as I believe it reads from a local directory rather than requiring you to set
up kafka, which is additional overhead you don't really need.
If you want to go ahead with Kafka, the two links below can give you a start

https://dzone.com/articles/running-apache-kafka-on-windows-os   (i believe
similar setup can be used on Linux)
https://spark.apache.org/docs/latest/streaming-kafka-integration.html
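
For reference, a word count along those lines is only a few lines of Scala and needs no
Kafka at all (a sketch; the object name and the input-path argument are illustrative):

import org.apache.spark.{SparkConf, SparkContext}

object LocalWordCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("LocalWordCount"))
    val counts = sc.textFile(args(0))            // args(0): a directory or file of text
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
    counts.take(20).foreach(println)
    sc.stop()
  }
}

Once something that simple submits cleanly, adding the Kafka pieces from the links above is
a much smaller step.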

kr




On Sat, Feb 25, 2017 at 11:12 PM, Marco Mistroni 
wrote:

> Hi, I'll have a look at the GitHub project tomorrow and let you know. You have a py
> script to run and dependencies to specify; please check the spark docs in the
> meantime... I do all my coding in Scala and specify dependencies using
> --packages.
> Kr
>
> On 25 Feb 2017 11:06 pm, "Raymond Xie"  wrote:
>
>> Thank you very much Marco,
>>
>> I am a beginner in this area, is it possible for you to show me what you
>> think the right script should be to get it executed in terminal?
>>
>>
>> **
>> *Sincerely yours,*
>>
>>
>> *Raymond*
>>
>> On Sat, Feb 25, 2017 at 6:00 PM, Marco Mistroni 
>> wrote:
>>
>>> Try to use --packages to include the jars. From the error it seems it's
>>> looking for a main class in the jars, but you are running a python script...
>>>
>>> On 25 Feb 2017 10:36 pm, "Raymond Xie"  wrote:
>>>
>>> That's right Anahita, however, the class name is not indicated in the
>>> original github project so I don't know what class should be used here. The
>>> github only says:
>>> and then run the example
>>> `$ bin/spark-submit --jars \
>>> external/kafka-assembly/target/scala-*/spark-streaming-kafka-assembly-*.jar
>>> \
>>> examples/src/main/python/streaming/kafka_wordcount.py \
>>> localhost:2181 test`
>>> """ Can anyone give any thought on how to find out? Thank you very much
>>> in advance.
>>>
>>>
>>> **
>>> *Sincerely yours,*
>>>
>>>
>>> *Raymond*
>>>
>>> On Sat, Feb 25, 2017 at 5:27 PM, Anahita Talebi <
>>> anahita.t.am...@gmail.com> wrote:
>>>
>>>> You're welcome.
>>>> You need to specify the class. I meant like that:
>>>>
>>>> spark-submit  /usr/hdp/2.5.0.0-1245/spark/l
>>>> ib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>> --class "give the name of the class"
>>>>
>>>>
>>>>
>>>> On Saturday, February 25, 2017, Raymond Xie 
>>>> wrote:
>>>>
>>>>> Thank you, it is still not working:
>>>>>
>>>>> [image: Inline image 1]
>>>>>
>>>>> By the way, here is the original source:
>>>>>
>>>>> https://github.com/apache/spark/blob/master/examples/src/mai
>>>>> n/python/streaming/kafka_wordcount.py
>>>>>
>>>>>
>>>>> **
>>>>> *Sincerely yours,*
>>>>>
>>>>>
>>>>> *Raymond*
>>>>>
>>>>> On Sat, Feb 25, 2017 at 4:48 PM, Anahita Talebi <
>>>>> anahita.t.am...@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I think if you remove --jars, it will work. Like:
>>>>>>
>>>>>> spark-submit  /usr/hdp/2.5.0.0-1245/spark/l
>>>>>> ib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>>>>
>>>>>>  I had the same problem before and solved it by removing --jars.
>>>>>>
>>>>>> Cheers,
>>>>>> Anahita
>>>>>>
>>>>>> On Saturday, February 25, 2017, Raymond Xie 
>>>>>> wrote:
>>>>>>
>>>>>>> I am doing a spark streaming on a hortonworks sandbox and am stuck
>>>>>>> here now, can anyone tell me what's wrong with the following code and 
>>>>>>> the
>>>>>>> exception it causes and how do I fix it? Thank you very much in adva

Re: No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-25 Thread Marco Mistroni
Try to use --packages to include the jars. From the error it seems it's looking
for a main class in the jars, but you are running a python script...

On 25 Feb 2017 10:36 pm, "Raymond Xie"  wrote:

That's right Anahita, however, the class name is not indicated in the
original github project so I don't know what class should be used here. The
github only says:
and then run the example
`$ bin/spark-submit --jars \
external/kafka-assembly/target/scala-*/spark-streaming-kafka-assembly-*.jar
\
examples/src/main/python/streaming/kafka_wordcount.py \
localhost:2181 test`
""" Can anyone give any thought on how to find out? Thank you very much in
advance.


**
*Sincerely yours,*


*Raymond*

On Sat, Feb 25, 2017 at 5:27 PM, Anahita Talebi 
wrote:

> You're welcome.
> You need to specify the class. I meant like that:
>
> spark-submit  /usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.
> 0-1245-hadoop2.7.3.2.5.0.0-1245.jar --class "give the name of the class"
>
>
>
> On Saturday, February 25, 2017, Raymond Xie  wrote:
>
>> Thank you, it is still not working:
>>
>> [image: Inline image 1]
>>
>> By the way, here is the original source:
>>
>> https://github.com/apache/spark/blob/master/examples/src/mai
>> n/python/streaming/kafka_wordcount.py
>>
>>
>> **
>> *Sincerely yours,*
>>
>>
>> *Raymond*
>>
>> On Sat, Feb 25, 2017 at 4:48 PM, Anahita Talebi <
>> anahita.t.am...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I think if you remove --jars, it will work. Like:
>>>
>>> spark-submit  /usr/hdp/2.5.0.0-1245/spark/l
>>> ib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>
>>>  I had the same problem before and solved it by removing --jars.
>>>
>>> Cheers,
>>> Anahita
>>>
>>> On Saturday, February 25, 2017, Raymond Xie 
>>> wrote:
>>>
>>>> I am doing a spark streaming on a hortonworks sandbox and am stuck here
>>>> now, can anyone tell me what's wrong with the following code and the
>>>> exception it causes and how do I fix it? Thank you very much in advance.
>>>>
>>>> spark-submit --jars /usr/hdp/2.5.0.0-1245/spark/li
>>>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>>  /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>>> /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test
>>>>
>>>> Error:
>>>> No main class set in JAR; please specify one with --class
>>>>
>>>>
>>>> spark-submit --class /usr/hdp/2.5.0.0-1245/spark/li
>>>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>>  /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>>> /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test
>>>>
>>>> Error:
>>>> java.lang.ClassNotFoundException: /usr/hdp/2.5.0.0-1245/spark/li
>>>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>>
>>>> spark-submit --class  /usr/hdp/2.5.0.0-1245/kafka/l
>>>> ibs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>>> /usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.0
>>>> -1245-hadoop2.7.3.2.5.0.0-1245.jar  /root/hdp/kafka_wordcount.py
>>>> 192.168.128.119:2181 test
>>>>
>>>> Error:
>>>> java.lang.ClassNotFoundException: /usr/hdp/2.5.0.0-1245/kafka/li
>>>> bs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>>>
>>>> **
>>>> *Sincerely yours,*
>>>>
>>>>
>>>> *Raymond*
>>>>
>>>
>>


Re: No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-25 Thread Raymond Xie
Thank you very much Marco,

I am a beginner in this area, is it possible for you to show me what you
think the right script should be to get it executed in terminal?


**
*Sincerely yours,*


*Raymond*

On Sat, Feb 25, 2017 at 6:00 PM, Marco Mistroni  wrote:

> Try to use --packages to include the jars. From the error it seems it's
> looking for a main class in the jars, but you are running a python script...
>
> On 25 Feb 2017 10:36 pm, "Raymond Xie"  wrote:
>
> That's right Anahita, however, the class name is not indicated in the
> original github project so I don't know what class should be used here. The
> github only says:
> and then run the example
> `$ bin/spark-submit --jars \
> external/kafka-assembly/target/scala-*/spark-streaming-kafka-assembly-*.jar
> \
> examples/src/main/python/streaming/kafka_wordcount.py \
> localhost:2181 test`
> """ Can anyone give any thought on how to find out? Thank you very much
> in advance.
>
>
> **
> *Sincerely yours,*
>
>
> *Raymond*
>
> On Sat, Feb 25, 2017 at 5:27 PM, Anahita Talebi  > wrote:
>
>> You're welcome.
>> You need to specify the class. I meant like that:
>>
>> spark-submit  /usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.
>> 0-1245-hadoop2.7.3.2.5.0.0-1245.jar --class "give the name of the class"
>>
>>
>>
>> On Saturday, February 25, 2017, Raymond Xie  wrote:
>>
>>> Thank you, it is still not working:
>>>
>>> [image: Inline image 1]
>>>
>>> By the way, here is the original source:
>>>
>>> https://github.com/apache/spark/blob/master/examples/src/mai
>>> n/python/streaming/kafka_wordcount.py
>>>
>>>
>>> **
>>> *Sincerely yours,*
>>>
>>>
>>> *Raymond*
>>>
>>> On Sat, Feb 25, 2017 at 4:48 PM, Anahita Talebi <
>>> anahita.t.am...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I think if you remove --jars, it will work. Like:
>>>>
>>>> spark-submit  /usr/hdp/2.5.0.0-1245/spark/l
>>>> ib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>>
>>>>  I had the same problem before and solved it by removing --jars.
>>>>
>>>> Cheers,
>>>> Anahita
>>>>
>>>> On Saturday, February 25, 2017, Raymond Xie 
>>>> wrote:
>>>>
>>>>> I am doing a spark streaming on a hortonworks sandbox and am stuck
>>>>> here now, can anyone tell me what's wrong with the following code and the
>>>>> exception it causes and how do I fix it? Thank you very much in advance.
>>>>>
>>>>> spark-submit --jars /usr/hdp/2.5.0.0-1245/spark/li
>>>>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>>>  /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>>>> /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test
>>>>>
>>>>> Error:
>>>>> No main class set in JAR; please specify one with --class
>>>>>
>>>>>
>>>>> spark-submit --class /usr/hdp/2.5.0.0-1245/spark/li
>>>>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>>>  /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>>>> /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test
>>>>>
>>>>> Error:
>>>>> java.lang.ClassNotFoundException: /usr/hdp/2.5.0.0-1245/spark/li
>>>>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>>>
>>>>> spark-submit --class  /usr/hdp/2.5.0.0-1245/kafka/l
>>>>> ibs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>>>> /usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.0
>>>>> -1245-hadoop2.7.3.2.5.0.0-1245.jar  /root/hdp/kafka_wordcount.py
>>>>> 192.168.128.119:2181 test
>>>>>
>>>>> Error:
>>>>> java.lang.ClassNotFoundException: /usr/hdp/2.5.0.0-1245/kafka/li
>>>>> bs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>>>>
>>>>> **
>>>>> *Sincerely yours,*
>>>>>
>>>>>
>>>>> *Raymond*
>>>>>
>>>>
>>>
>
>


Re: No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-25 Thread Raymond Xie
That's right Anahita, however, the class name is not indicated in the
original github project so I don't know what class should be used here. The
github only says:
and then run the example
`$ bin/spark-submit --jars \
external/kafka-assembly/target/scala-*/spark-streaming-kafka-assembly-*.jar
\
examples/src/main/python/streaming/kafka_wordcount.py \
localhost:2181 test`
""" Can anyone give any thought on how to find out? Thank you very much in
advance.


**
*Sincerely yours,*


*Raymond*

On Sat, Feb 25, 2017 at 5:27 PM, Anahita Talebi 
wrote:

> You're welcome.
> You need to specify the class. I meant like that:
>
> spark-submit  /usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.
> 0-1245-hadoop2.7.3.2.5.0.0-1245.jar --class "give the name of the class"
>
>
>
> On Saturday, February 25, 2017, Raymond Xie  wrote:
>
>> Thank you, it is still not working:
>>
>> [image: Inline image 1]
>>
>> By the way, here is the original source:
>>
>> https://github.com/apache/spark/blob/master/examples/src/mai
>> n/python/streaming/kafka_wordcount.py
>>
>>
>> **
>> *Sincerely yours,*
>>
>>
>> *Raymond*
>>
>> On Sat, Feb 25, 2017 at 4:48 PM, Anahita Talebi <
>> anahita.t.am...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I think if you remove --jars, it will work. Like:
>>>
>>> spark-submit  /usr/hdp/2.5.0.0-1245/spark/l
>>> ib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>
>>>  I had the same problem before and solved it by removing --jars.
>>>
>>> Cheers,
>>> Anahita
>>>
>>> On Saturday, February 25, 2017, Raymond Xie 
>>> wrote:
>>>
>>>> I am doing a spark streaming on a hortonworks sandbox and am stuck here
>>>> now, can anyone tell me what's wrong with the following code and the
>>>> exception it causes and how do I fix it? Thank you very much in advance.
>>>>
>>>> spark-submit --jars /usr/hdp/2.5.0.0-1245/spark/li
>>>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>>  /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>>> /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test
>>>>
>>>> Error:
>>>> No main class set in JAR; please specify one with --class
>>>>
>>>>
>>>> spark-submit --class /usr/hdp/2.5.0.0-1245/spark/li
>>>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>>  /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>>> /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test
>>>>
>>>> Error:
>>>> java.lang.ClassNotFoundException: /usr/hdp/2.5.0.0-1245/spark/li
>>>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>>
>>>> spark-submit --class  /usr/hdp/2.5.0.0-1245/kafka/l
>>>> ibs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>>> /usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.0
>>>> -1245-hadoop2.7.3.2.5.0.0-1245.jar  /root/hdp/kafka_wordcount.py
>>>> 192.168.128.119:2181 test
>>>>
>>>> Error:
>>>> java.lang.ClassNotFoundException: /usr/hdp/2.5.0.0-1245/kafka/li
>>>> bs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>>>
>>>> **
>>>> *Sincerely yours,*
>>>>
>>>>
>>>> *Raymond*
>>>>
>>>
>>


Re: No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-25 Thread Anahita Talebi
You're welcome.
You need to specify the class. I meant like that:

spark-submit  /usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.
0-1245-hadoop2.7.3.2.5.0.0-1245.jar --class "give the name of the class"



On Saturday, February 25, 2017, Raymond Xie  wrote:

> Thank you, it is still not working:
>
> [image: Inline image 1]
>
> By the way, here is the original source:
>
> https://github.com/apache/spark/blob/master/examples/
> src/main/python/streaming/kafka_wordcount.py
>
>
> **
> *Sincerely yours,*
>
>
> *Raymond*
>
> On Sat, Feb 25, 2017 at 4:48 PM, Anahita Talebi  > wrote:
>
>> Hi,
>>
>> I think if you remove --jars, it will work. Like:
>>
>> spark-submit  /usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.
>> 0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>
>>  I had the same problem before and solved it by removing --jars.
>>
>> Cheers,
>> Anahita
>>
>> On Saturday, February 25, 2017, Raymond Xie > > wrote:
>>
>>> I am doing a spark streaming on a hortonworks sandbox and am stuck here
>>> now, can anyone tell me what's wrong with the following code and the
>>> exception it causes and how do I fix it? Thank you very much in advance.
>>>
>>> spark-submit --jars /usr/hdp/2.5.0.0-1245/spark/li
>>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>  /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>> /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test
>>>
>>> Error:
>>> No main class set in JAR; please specify one with --class
>>>
>>>
>>> spark-submit --class /usr/hdp/2.5.0.0-1245/spark/li
>>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>  /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>> /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test
>>>
>>> Error:
>>> java.lang.ClassNotFoundException: /usr/hdp/2.5.0.0-1245/spark/li
>>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>
>>> spark-submit --class  /usr/hdp/2.5.0.0-1245/kafka/l
>>> ibs/kafka-streams-0.10.0.2.5.0.0-1245.jar /usr/hdp/2.5.0.0-1245/spark/li
>>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>>  /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test
>>>
>>> Error:
>>> java.lang.ClassNotFoundException: /usr/hdp/2.5.0.0-1245/kafka/li
>>> bs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>>
>>> **
>>> *Sincerely yours,*
>>>
>>>
>>> *Raymond*
>>>
>>
>


Re: No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-25 Thread Raymond Xie
Thank you, it is still not working:

[image: Inline image 1]

By the way, here is the original source:

https://github.com/apache/spark/blob/master/examples/src/main/python/streaming/kafka_wordcount.py


**
*Sincerely yours,*


*Raymond*

On Sat, Feb 25, 2017 at 4:48 PM, Anahita Talebi 
wrote:

> Hi,
>
> I think if you remove --jars, it will work. Like:
>
> spark-submit  /usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.
> 0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>
>  I had the same problem before and solved it by removing --jars.
>
> Cheers,
> Anahita
>
> On Saturday, February 25, 2017, Raymond Xie  wrote:
>
>> I am doing a spark streaming on a hortonworks sandbox and am stuck here
>> now, can anyone tell me what's wrong with the following code and the
>> exception it causes and how do I fix it? Thank you very much in advance.
>>
>> spark-submit --jars /usr/hdp/2.5.0.0-1245/spark/li
>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>  /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>> /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test
>>
>> Error:
>> No main class set in JAR; please specify one with --class
>>
>>
>> spark-submit --class /usr/hdp/2.5.0.0-1245/spark/li
>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>  /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>> /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test
>>
>> Error:
>> java.lang.ClassNotFoundException: /usr/hdp/2.5.0.0-1245/spark/li
>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>
>> spark-submit --class  /usr/hdp/2.5.0.0-1245/kafka/l
>> ibs/kafka-streams-0.10.0.2.5.0.0-1245.jar /usr/hdp/2.5.0.0-1245/spark/li
>> b/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>>  /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test
>>
>> Error:
>> java.lang.ClassNotFoundException: /usr/hdp/2.5.0.0-1245/kafka/li
>> bs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>>
>> **
>> *Sincerely yours,*
>>
>>
>> *Raymond*
>>
>


Re: No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-25 Thread Anahita Talebi
Hi,

I think if you remove --jars, it will work. Like:

spark-submit  /usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.
0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar

 I had the same problem before and solved it by removing --jars.

Cheers,
Anahita

On Saturday, February 25, 2017, Raymond Xie  wrote:

> I am doing a spark streaming on a hortonworks sandbox and am stuck here
> now, can anyone tell me what's wrong with the following code and the
> exception it causes and how do I fix it? Thank you very much in advance.
>
> spark-submit --jars /usr/hdp/2.5.0.0-1245/spark/
> lib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>  /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
> /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test
>
> Error:
> No main class set in JAR; please specify one with --class
>
>
> spark-submit --class /usr/hdp/2.5.0.0-1245/spark/
> lib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>  /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
> /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test
>
> Error:
> java.lang.ClassNotFoundException: /usr/hdp/2.5.0.0-1245/spark/
> lib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>
> spark-submit --class  /usr/hdp/2.5.0.0-1245/kafka/
> libs/kafka-streams-0.10.0.2.5.0.0-1245.jar /usr/hdp/2.5.0.0-1245/spark/
> lib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
>  /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test
>
> Error:
> java.lang.ClassNotFoundException: /usr/hdp/2.5.0.0-1245/kafka/
> libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
>
> **
> *Sincerely yours,*
>
>
> *Raymond*
>


Re: No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-25 Thread yohann jardin
You should read (again?) the Spark documentation about submitting an 
application: http://spark.apache.org/docs/latest/submitting-applications.html

Try with the Pi computation example available with Spark.
For example:

./bin/spark-submit --class org.apache.spark.examples.SparkPi 
examples/jars/spark-examples*.jar

after --class you specify the fully qualified name, inside your provided jar, of the main 
class you want to run. You finish by specifying the jar that contains that main class.
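
For contrast, the smallest Scala main class that --class could point at looks like this
(the names are illustrative; a Python script such as kafka_wordcount.py is submitted
directly and takes no --class at all):

// submitted with:  spark-submit --class example.Main /path/to/your-app.jar
package example

import org.apache.spark.{SparkConf, SparkContext}

object Main {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("example"))
    println("started Spark " + sc.version)
    sc.stop()
  }
}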

Yohann Jardin

On 2/25/2017 at 9:50 PM, Raymond Xie wrote:
I am doing a spark streaming on a hortonworks sandbox and am stuck here now, 
can anyone tell me what's wrong with the following code and the exception it 
causes and how do I fix it? Thank you very much in advance.

spark-submit --jars 
/usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
  /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar 
/root/hdp/kafka_wordcount.py 192.168.128.119:2181 test

Error:
No main class set in JAR; please specify one with --class


spark-submit --class 
/usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
  /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar 
/root/hdp/kafka_wordcount.py 192.168.128.119:2181 test

Error:
java.lang.ClassNotFoundException: 
/usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar

spark-submit --class  
/usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar 
/usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
  /root/hdp/kafka_wordcount.py 
192.168.128.119:2181 test

Error:
java.lang.ClassNotFoundException: 
/usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar


Sincerely yours,


Raymond



No main class set in JAR; please specify one with --class and java.lang.ClassNotFoundException

2017-02-25 Thread Raymond Xie
I am doing a spark streaming on a hortonworks sandbox and am stuck here
now, can anyone tell me what's wrong with the following code and the
exception it causes and how do I fix it? Thank you very much in advance.

spark-submit --jars
/usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
 /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
/root/hdp/kafka_wordcount.py 192.168.128.119:2181 test

Error:
No main class set in JAR; please specify one with --class


spark-submit --class
/usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
 /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
/root/hdp/kafka_wordcount.py 192.168.128.119:2181 test

Error:
java.lang.ClassNotFoundException:
/usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar

spark-submit --class
 /usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar
/usr/hdp/2.5.0.0-1245/spark/lib/spark-assembly-1.6.2.2.5.0.0-1245-hadoop2.7.3.2.5.0.0-1245.jar
 /root/hdp/kafka_wordcount.py 192.168.128.119:2181 test

Error:
java.lang.ClassNotFoundException:
/usr/hdp/2.5.0.0-1245/kafka/libs/kafka-streams-0.10.0.2.5.0.0-1245.jar

**
*Sincerely yours,*


*Raymond*


Re: java.lang.ClassNotFoundException: org.apache.spark.sql.SparkSession$ . Please Help!!!!!!!

2016-11-04 Thread shyla deshpande
I feel so good that Holden replied.

Yes, that was the problem. I was running from IntelliJ; I removed the
provided scope and it works great.

Thanks a lot.

On Fri, Nov 4, 2016 at 2:05 PM, Holden Karau  wrote:

> It seems like you've marked the spark jars as provided, in this case they
> would only be provided you run your application with spark-submit or
> otherwise have Spark's JARs on your class path. How are you launching your
> application?
>
> On Fri, Nov 4, 2016 at 2:00 PM, shyla deshpande 
> wrote:
>
>> object App {
>>
>>
>>  import org.apache.spark.sql.functions._
>> import org.apache.spark.sql.SparkSession
>>
>>   def main(args : Array[String]) {
>> println( "Hello World!" )
>>   val sparkSession = SparkSession.builder.
>>   master("local")
>>   .appName("spark session example")
>>   .getOrCreate()
>>   }
>>
>> }
>>
>>
>> 
>>   1.8
>>   1.8
>>   UTF-8
>>   2.11.8
>>   2.11
>> 
>>
>> 
>>   
>> org.scala-lang
>> scala-library
>> ${scala.version}
>>   
>>
>>   
>>   org.apache.spark
>>   spark-core_2.11
>>   2.0.1
>>   provided
>>   
>>   
>>   org.apache.spark
>>   spark-sql_2.11
>>   2.0.1
>>   provided
>>   
>>
>>   
>> org.specs2
>> specs2-core_${scala.compat.version}
>> 2.4.16
>> test
>>   
>> 
>>
>> 
>>   src/main/scala
>> 
>>
>>
>
>
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau
>


Re: java.lang.ClassNotFoundException: org.apache.spark.sql.SparkSession$ . Please Help!!!!!!!

2016-11-04 Thread Holden Karau
It seems like you've marked the Spark jars as provided; in this case they
are only provided when you run your application with spark-submit or
otherwise have Spark's JARs on your classpath. How are you launching your
application?
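
As a sketch of the two alternatives (coordinates copied from the pom quoted below; only the scope differs), either keep provided and launch with spark-submit, or drop the provided scope so the Spark jars are on the classpath when launching straight from the IDE:

<!-- launched via spark-submit: the cluster supplies Spark at runtime -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>2.0.1</version>
  <scope>provided</scope>
</dependency>

<!-- launched directly from the IDE: default (compile) scope puts Spark on the classpath -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>2.0.1</version>
</dependency>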

On Fri, Nov 4, 2016 at 2:00 PM, shyla deshpande 
wrote:

> object App {
>
>
>  import org.apache.spark.sql.functions._
> import org.apache.spark.sql.SparkSession
>
>   def main(args : Array[String]) {
> println( "Hello World!" )
>   val sparkSession = SparkSession.builder.
>   master("local")
>   .appName("spark session example")
>   .getOrCreate()
>   }
>
> }
>
>
> 
>   1.8
>   1.8
>   UTF-8
>   2.11.8
>   2.11
> 
>
> 
>   
> org.scala-lang
> scala-library
> ${scala.version}
>   
>
>   
>   org.apache.spark
>   spark-core_2.11
>   2.0.1
>   provided
>   
>   
>   org.apache.spark
>   spark-sql_2.11
>   2.0.1
>   provided
>   
>
>   
> org.specs2
> specs2-core_${scala.compat.version}
> 2.4.16
> test
>   
> 
>
> 
>   src/main/scala
> 
>
>


-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau


java.lang.ClassNotFoundException: org.apache.spark.sql.SparkSession$ . Please Help!!!!!!!

2016-11-04 Thread shyla deshpande
object App {

  import org.apache.spark.sql.functions._
  import org.apache.spark.sql.SparkSession

  def main(args: Array[String]) {
    println("Hello World!")
    val sparkSession = SparkSession.builder
      .master("local")
      .appName("spark session example")
      .getOrCreate()
  }

}



<properties>
  <maven.compiler.source>1.8</maven.compiler.source>
  <maven.compiler.target>1.8</maven.compiler.target>
  <encoding>UTF-8</encoding>
  <scala.version>2.11.8</scala.version>
  <scala.compat.version>2.11</scala.compat.version>
</properties>

<dependencies>
  <dependency>
    <groupId>org.scala-lang</groupId>
    <artifactId>scala-library</artifactId>
    <version>${scala.version}</version>
  </dependency>

  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.0.1</version>
    <scope>provided</scope>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.11</artifactId>
    <version>2.0.1</version>
    <scope>provided</scope>
  </dependency>

  <dependency>
    <groupId>org.specs2</groupId>
    <artifactId>specs2-core_${scala.compat.version}</artifactId>
    <version>2.4.16</version>
    <scope>test</scope>
  </dependency>
</dependencies>

<build>
  <sourceDirectory>src/main/scala</sourceDirectory>
</build>



Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Siva A
Use Spark XML version 0.3.3:

<dependency>
  <groupId>com.databricks</groupId>
  <artifactId>spark-xml_2.10</artifactId>
  <version>0.3.3</version>
</dependency>


On Fri, Jun 17, 2016 at 4:25 PM, VG  wrote:

> Hi Siva
>
> This is what i have for jars. Did you manage to run with these or
> different versions ?
>
>
> 
> org.apache.spark
> spark-core_2.10
> 1.6.1
> 
> 
> org.apache.spark
> spark-sql_2.10
> 1.6.1
> 
> 
> com.databricks
> spark-xml_2.10
> 0.2.0
> 
> 
> org.scala-lang
> scala-library
> 2.10.6
> 
>
> Thanks
> VG
>
>
> On Fri, Jun 17, 2016 at 4:16 PM, Siva A  wrote:
>
>> Hi Marco,
>>
>> I did run in IDE(Intellij) as well. It works fine.
>> VG, make sure the right jar is in classpath.
>>
>> --Siva
>>
>> On Fri, Jun 17, 2016 at 4:11 PM, Marco Mistroni 
>> wrote:
>>
>>> and  your eclipse path is correct?
>>> i suggest, as Siva did before, to build your jar and run it via
>>> spark-submit  by specifying the --packages option
>>> it's as simple as run this command
>>>
>>> spark-submit   --packages
>>> com.databricks:spark-xml_:   --class >> your class containing main> 
>>>
>>> Indeed, if you have only these lines to run, why dont you try them in
>>> spark-shell ?
>>>
>>> hth
>>>
>>> On Fri, Jun 17, 2016 at 11:32 AM, VG  wrote:
>>>
>>>> nopes. eclipse.
>>>>
>>>>
>>>> On Fri, Jun 17, 2016 at 3:58 PM, Siva A 
>>>> wrote:
>>>>
>>>>> If you are running from IDE, Are you using Intellij?
>>>>>
>>>>> On Fri, Jun 17, 2016 at 3:20 PM, Siva A 
>>>>> wrote:
>>>>>
>>>>>> Can you try to package as a jar and run using spark-submit
>>>>>>
>>>>>> Siva
>>>>>>
>>>>>> On Fri, Jun 17, 2016 at 3:17 PM, VG  wrote:
>>>>>>
>>>>>>> I am trying to run from IDE and everything else is working fine.
>>>>>>> I added spark-xml jar and now I ended up into this dependency
>>>>>>>
>>>>>>> 6/06/17 15:15:57 INFO BlockManagerMaster: Registered BlockManager
>>>>>>> Exception in thread "main" *java.lang.NoClassDefFoundError:
>>>>>>> scala/collection/GenTraversableOnce$class*
>>>>>>> at
>>>>>>> org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.(ddl.scala:150)
>>>>>>> at
>>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:154)
>>>>>>> at
>>>>>>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>>>>>> at
>>>>>>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>>>>>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>>>>>> Caused by:* java.lang.ClassNotFoundException:
>>>>>>> scala.collection.GenTraversableOnce$class*
>>>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>>>>> ... 5 more
>>>>>>> 16/06/17 15:15:58 INFO SparkContext: Invoking stop() from shutdown
>>>>>>> hook
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Jun 17, 2016 at 2:59 PM, Marco Mistroni >>>>>> > wrote:
>>>>>>>
>>>>>>>> So you are using spark-submit  or spark-shell?
>>>>>>>>
>>>>>>>> you will need to launch either by passing --packages option (like
>>>>>>>> in the example below for spark-csv). you will need to iknow
>>>>>>>>
>>>>>>>> --packages com.databricks:spark-xml_:>>>>>>> version>
>>>>>>>>
>>>>>>>> hth
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Jun 17, 2016 at 10:20 AM, VG  wrote:
>>>>>>>>
>>>>>>>>> Apologies for that.
>>>>>>>>> I am trying to use spark-xml to l

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread VG
It proceeded with the jars I mentioned.
However, no data is getting loaded into the data frame...

sob sob :(

On Fri, Jun 17, 2016 at 4:25 PM, VG  wrote:

> Hi Siva
>
> This is what i have for jars. Did you manage to run with these or
> different versions ?
>
>
> 
> org.apache.spark
> spark-core_2.10
> 1.6.1
> 
> 
> org.apache.spark
> spark-sql_2.10
> 1.6.1
> 
> 
> com.databricks
> spark-xml_2.10
> 0.2.0
> 
> 
> org.scala-lang
> scala-library
> 2.10.6
> 
>
> Thanks
> VG
>
>
> On Fri, Jun 17, 2016 at 4:16 PM, Siva A  wrote:
>
>> Hi Marco,
>>
>> I did run in IDE(Intellij) as well. It works fine.
>> VG, make sure the right jar is in classpath.
>>
>> --Siva
>>
>> On Fri, Jun 17, 2016 at 4:11 PM, Marco Mistroni 
>> wrote:
>>
>>> and  your eclipse path is correct?
>>> i suggest, as Siva did before, to build your jar and run it via
>>> spark-submit  by specifying the --packages option
>>> it's as simple as run this command
>>>
>>> spark-submit   --packages
>>> com.databricks:spark-xml_:   --class >> your class containing main> 
>>>
>>> Indeed, if you have only these lines to run, why dont you try them in
>>> spark-shell ?
>>>
>>> hth
>>>
>>> On Fri, Jun 17, 2016 at 11:32 AM, VG  wrote:
>>>
>>>> nopes. eclipse.
>>>>
>>>>
>>>> On Fri, Jun 17, 2016 at 3:58 PM, Siva A 
>>>> wrote:
>>>>
>>>>> If you are running from IDE, Are you using Intellij?
>>>>>
>>>>> On Fri, Jun 17, 2016 at 3:20 PM, Siva A 
>>>>> wrote:
>>>>>
>>>>>> Can you try to package as a jar and run using spark-submit
>>>>>>
>>>>>> Siva
>>>>>>
>>>>>> On Fri, Jun 17, 2016 at 3:17 PM, VG  wrote:
>>>>>>
>>>>>>> I am trying to run from IDE and everything else is working fine.
>>>>>>> I added spark-xml jar and now I ended up into this dependency
>>>>>>>
>>>>>>> 6/06/17 15:15:57 INFO BlockManagerMaster: Registered BlockManager
>>>>>>> Exception in thread "main" *java.lang.NoClassDefFoundError:
>>>>>>> scala/collection/GenTraversableOnce$class*
>>>>>>> at
>>>>>>> org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.(ddl.scala:150)
>>>>>>> at
>>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:154)
>>>>>>> at
>>>>>>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>>>>>> at
>>>>>>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>>>>>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>>>>>> Caused by:* java.lang.ClassNotFoundException:
>>>>>>> scala.collection.GenTraversableOnce$class*
>>>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>>>>> ... 5 more
>>>>>>> 16/06/17 15:15:58 INFO SparkContext: Invoking stop() from shutdown
>>>>>>> hook
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Jun 17, 2016 at 2:59 PM, Marco Mistroni >>>>>> > wrote:
>>>>>>>
>>>>>>>> So you are using spark-submit  or spark-shell?
>>>>>>>>
>>>>>>>> you will need to launch either by passing --packages option (like
>>>>>>>> in the example below for spark-csv). you will need to iknow
>>>>>>>>
>>>>>>>> --packages com.databricks:spark-xml_:>>>>>>> version>
>>>>>>>>
>>>>>>>> hth
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Jun 17, 2016 at 10:20 AM, VG  wrote:
>>>>>>>>
>>>>>>>>> Apologies for that.
>>>>>>>>&g

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread VG
Hi Siva

This is what I have for jars. Did you manage to run with these or with different
versions?



<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.6.1</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.10</artifactId>
  <version>1.6.1</version>
</dependency>
<dependency>
  <groupId>com.databricks</groupId>
  <artifactId>spark-xml_2.10</artifactId>
  <version>0.2.0</version>
</dependency>
<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <version>2.10.6</version>
</dependency>

Thanks
VG


On Fri, Jun 17, 2016 at 4:16 PM, Siva A  wrote:

> Hi Marco,
>
> I did run in IDE(Intellij) as well. It works fine.
> VG, make sure the right jar is in classpath.
>
> --Siva
>
> On Fri, Jun 17, 2016 at 4:11 PM, Marco Mistroni 
> wrote:
>
>> and  your eclipse path is correct?
>> i suggest, as Siva did before, to build your jar and run it via
>> spark-submit  by specifying the --packages option
>> it's as simple as run this command
>>
>> spark-submit   --packages
>> com.databricks:spark-xml_:   --class > your class containing main> 
>>
>> Indeed, if you have only these lines to run, why dont you try them in
>> spark-shell ?
>>
>> hth
>>
>> On Fri, Jun 17, 2016 at 11:32 AM, VG  wrote:
>>
>>> nopes. eclipse.
>>>
>>>
>>> On Fri, Jun 17, 2016 at 3:58 PM, Siva A 
>>> wrote:
>>>
>>>> If you are running from IDE, Are you using Intellij?
>>>>
>>>> On Fri, Jun 17, 2016 at 3:20 PM, Siva A 
>>>> wrote:
>>>>
>>>>> Can you try to package as a jar and run using spark-submit
>>>>>
>>>>> Siva
>>>>>
>>>>> On Fri, Jun 17, 2016 at 3:17 PM, VG  wrote:
>>>>>
>>>>>> I am trying to run from IDE and everything else is working fine.
>>>>>> I added spark-xml jar and now I ended up into this dependency
>>>>>>
>>>>>> 6/06/17 15:15:57 INFO BlockManagerMaster: Registered BlockManager
>>>>>> Exception in thread "main" *java.lang.NoClassDefFoundError:
>>>>>> scala/collection/GenTraversableOnce$class*
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.(ddl.scala:150)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:154)
>>>>>> at
>>>>>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>>>>> at
>>>>>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>>>>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>>>>> Caused by:* java.lang.ClassNotFoundException:
>>>>>> scala.collection.GenTraversableOnce$class*
>>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>>>> ... 5 more
>>>>>> 16/06/17 15:15:58 INFO SparkContext: Invoking stop() from shutdown
>>>>>> hook
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 17, 2016 at 2:59 PM, Marco Mistroni 
>>>>>> wrote:
>>>>>>
>>>>>>> So you are using spark-submit  or spark-shell?
>>>>>>>
>>>>>>> you will need to launch either by passing --packages option (like in
>>>>>>> the example below for spark-csv). you will need to iknow
>>>>>>>
>>>>>>> --packages com.databricks:spark-xml_:
>>>>>>>
>>>>>>> hth
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Jun 17, 2016 at 10:20 AM, VG  wrote:
>>>>>>>
>>>>>>>> Apologies for that.
>>>>>>>> I am trying to use spark-xml to load data of a xml file.
>>>>>>>>
>>>>>>>> here is the exception
>>>>>>>>
>>>>>>>> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
>>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException: Failed
>>>>>>>> to find data source: org.apache.spark.xml. Please find packages at
>>>>>>>> http://spark-packages.org
>>>>>>>> at
>>>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(Reso

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Siva A
Hi Marco,

I did run it in the IDE (IntelliJ) as well. It works fine.
VG, make sure the right jar is on the classpath.

--Siva

On Fri, Jun 17, 2016 at 4:11 PM, Marco Mistroni  wrote:

> and  your eclipse path is correct?
> i suggest, as Siva did before, to build your jar and run it via
> spark-submit  by specifying the --packages option
> it's as simple as run this command
>
> spark-submit   --packages
> com.databricks:spark-xml_:   --class  your class containing main> 
>
> Indeed, if you have only these lines to run, why dont you try them in
> spark-shell ?
>
> hth
>
> On Fri, Jun 17, 2016 at 11:32 AM, VG  wrote:
>
>> nopes. eclipse.
>>
>>
>> On Fri, Jun 17, 2016 at 3:58 PM, Siva A  wrote:
>>
>>> If you are running from IDE, Are you using Intellij?
>>>
>>> On Fri, Jun 17, 2016 at 3:20 PM, Siva A 
>>> wrote:
>>>
>>>> Can you try to package as a jar and run using spark-submit
>>>>
>>>> Siva
>>>>
>>>> On Fri, Jun 17, 2016 at 3:17 PM, VG  wrote:
>>>>
>>>>> I am trying to run from IDE and everything else is working fine.
>>>>> I added spark-xml jar and now I ended up into this dependency
>>>>>
>>>>> 6/06/17 15:15:57 INFO BlockManagerMaster: Registered BlockManager
>>>>> Exception in thread "main" *java.lang.NoClassDefFoundError:
>>>>> scala/collection/GenTraversableOnce$class*
>>>>> at
>>>>> org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.(ddl.scala:150)
>>>>> at
>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:154)
>>>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>>>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>>>> Caused by:* java.lang.ClassNotFoundException:
>>>>> scala.collection.GenTraversableOnce$class*
>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>>> ... 5 more
>>>>> 16/06/17 15:15:58 INFO SparkContext: Invoking stop() from shutdown hook
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jun 17, 2016 at 2:59 PM, Marco Mistroni 
>>>>> wrote:
>>>>>
>>>>>> So you are using spark-submit  or spark-shell?
>>>>>>
>>>>>> you will need to launch either by passing --packages option (like in
>>>>>> the example below for spark-csv). you will need to iknow
>>>>>>
>>>>>> --packages com.databricks:spark-xml_:
>>>>>>
>>>>>> hth
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 17, 2016 at 10:20 AM, VG  wrote:
>>>>>>
>>>>>>> Apologies for that.
>>>>>>> I am trying to use spark-xml to load data of a xml file.
>>>>>>>
>>>>>>> here is the exception
>>>>>>>
>>>>>>> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
>>>>>>> Exception in thread "main" java.lang.ClassNotFoundException: Failed
>>>>>>> to find data source: org.apache.spark.xml. Please find packages at
>>>>>>> http://spark-packages.org
>>>>>>> at
>>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
>>>>>>> at
>>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
>>>>>>> at
>>>>>>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>>>>>> at
>>>>>>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>>>>>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>>> org.apache.spark.xml.DefaultSource
>>>>>>> at java.net.URLClassLoader.findClass(URLClassLoade

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Marco Mistroni
And is your Eclipse path correct?
I suggest, as Siva did before, building your jar and running it via
spark-submit, specifying the --packages option.
It's as simple as running this command:

spark-submit --packages
com.databricks:spark-xml_<scala version>:<spark-xml version> --class <your class containing main> <your jar>

Indeed, if you have only these lines to run, why don't you try them in
spark-shell?

hth

On Fri, Jun 17, 2016 at 11:32 AM, VG  wrote:

> nopes. eclipse.
>
>
> On Fri, Jun 17, 2016 at 3:58 PM, Siva A  wrote:
>
>> If you are running from IDE, Are you using Intellij?
>>
>> On Fri, Jun 17, 2016 at 3:20 PM, Siva A  wrote:
>>
>>> Can you try to package as a jar and run using spark-submit
>>>
>>> Siva
>>>
>>> On Fri, Jun 17, 2016 at 3:17 PM, VG  wrote:
>>>
>>>> I am trying to run from IDE and everything else is working fine.
>>>> I added spark-xml jar and now I ended up into this dependency
>>>>
>>>> 6/06/17 15:15:57 INFO BlockManagerMaster: Registered BlockManager
>>>> Exception in thread "main" *java.lang.NoClassDefFoundError:
>>>> scala/collection/GenTraversableOnce$class*
>>>> at
>>>> org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.(ddl.scala:150)
>>>> at
>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:154)
>>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>>> Caused by:* java.lang.ClassNotFoundException:
>>>> scala.collection.GenTraversableOnce$class*
>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>> ... 5 more
>>>> 16/06/17 15:15:58 INFO SparkContext: Invoking stop() from shutdown hook
>>>>
>>>>
>>>>
>>>> On Fri, Jun 17, 2016 at 2:59 PM, Marco Mistroni 
>>>> wrote:
>>>>
>>>>> So you are using spark-submit  or spark-shell?
>>>>>
>>>>> you will need to launch either by passing --packages option (like in
>>>>> the example below for spark-csv). you will need to iknow
>>>>>
>>>>> --packages com.databricks:spark-xml_:
>>>>>
>>>>> hth
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jun 17, 2016 at 10:20 AM, VG  wrote:
>>>>>
>>>>>> Apologies for that.
>>>>>> I am trying to use spark-xml to load data of a xml file.
>>>>>>
>>>>>> here is the exception
>>>>>>
>>>>>> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
>>>>>> Exception in thread "main" java.lang.ClassNotFoundException: Failed
>>>>>> to find data source: org.apache.spark.xml. Please find packages at
>>>>>> http://spark-packages.org
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
>>>>>> at
>>>>>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>>>>> at
>>>>>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>>>>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>> org.apache.spark.xml.DefaultSource
>>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala

Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Siva A
Try to import the class and see if you are getting a compilation error:

import com.databricks.spark.xml

Siva

On Fri, Jun 17, 2016 at 4:02 PM, VG  wrote:

> nopes. eclipse.
>
>
> On Fri, Jun 17, 2016 at 3:58 PM, Siva A  wrote:
>
>> If you are running from IDE, Are you using Intellij?
>>
>> On Fri, Jun 17, 2016 at 3:20 PM, Siva A  wrote:
>>
>>> Can you try to package as a jar and run using spark-submit
>>>
>>> Siva
>>>
>>> On Fri, Jun 17, 2016 at 3:17 PM, VG  wrote:
>>>
>>>> I am trying to run from IDE and everything else is working fine.
>>>> I added spark-xml jar and now I ended up into this dependency
>>>>
>>>> 6/06/17 15:15:57 INFO BlockManagerMaster: Registered BlockManager
>>>> Exception in thread "main" *java.lang.NoClassDefFoundError:
>>>> scala/collection/GenTraversableOnce$class*
>>>> at
>>>> org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.(ddl.scala:150)
>>>> at
>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:154)
>>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>>> Caused by:* java.lang.ClassNotFoundException:
>>>> scala.collection.GenTraversableOnce$class*
>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>> ... 5 more
>>>> 16/06/17 15:15:58 INFO SparkContext: Invoking stop() from shutdown hook
>>>>
>>>>
>>>>
>>>> On Fri, Jun 17, 2016 at 2:59 PM, Marco Mistroni 
>>>> wrote:
>>>>
>>>>> So you are using spark-submit  or spark-shell?
>>>>>
>>>>> you will need to launch either by passing --packages option (like in
>>>>> the example below for spark-csv). you will need to iknow
>>>>>
>>>>> --packages com.databricks:spark-xml_:
>>>>>
>>>>> hth
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jun 17, 2016 at 10:20 AM, VG  wrote:
>>>>>
>>>>>> Apologies for that.
>>>>>> I am trying to use spark-xml to load data of a xml file.
>>>>>>
>>>>>> here is the exception
>>>>>>
>>>>>> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
>>>>>> Exception in thread "main" java.lang.ClassNotFoundException: Failed
>>>>>> to find data source: org.apache.spark.xml. Please find packages at
>>>>>> http://spark-packages.org
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
>>>>>> at
>>>>>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>>>>> at
>>>>>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>>>>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>>> org.apache.spark.xml.DefaultSource
>>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>>>>>> at scala.util.Try$.apply(Try.scala:192)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>>>>>> at scala.util.Try.orElse(Try.scala:84)
>>>>>> at
>>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
>>>>>> ... 4 more
>>>>>>
>>>>>> Code
>>>>>> SQLContext sqlContext = new SQLContext(sc);
>>>>>> DataFrame df = sqlContext.read()
>>>>>> .format("org.apache.spark.xml")
>>>>>> .option("rowTag", "row")
>>>>>> .load("A.xml");
>>>>>>
>>>>>> Any suggestions please ..
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Jun 17, 2016 at 2:42 PM, Marco Mistroni 
>>>>>> wrote:
>>>>>>
>>>>>>> too little info
>>>>>>> it'll help if you can post the exception and show your sbt file (if
>>>>>>> you are using sbt), and provide minimal details on what you are doing
>>>>>>> kr
>>>>>>>
>>>>>>> On Fri, Jun 17, 2016 at 10:08 AM, VG  wrote:
>>>>>>>
>>>>>>>> Failed to find data source: com.databricks.spark.xml
>>>>>>>>
>>>>>>>> Any suggestions to resolve this
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread VG
nopes. eclipse.


On Fri, Jun 17, 2016 at 3:58 PM, Siva A  wrote:

> If you are running from IDE, Are you using Intellij?
>
> On Fri, Jun 17, 2016 at 3:20 PM, Siva A  wrote:
>
>> Can you try to package as a jar and run using spark-submit
>>
>> Siva
>>
>> On Fri, Jun 17, 2016 at 3:17 PM, VG  wrote:
>>
>>> I am trying to run from IDE and everything else is working fine.
>>> I added spark-xml jar and now I ended up into this dependency
>>>
>>> 6/06/17 15:15:57 INFO BlockManagerMaster: Registered BlockManager
>>> Exception in thread "main" *java.lang.NoClassDefFoundError:
>>> scala/collection/GenTraversableOnce$class*
>>> at
>>> org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.(ddl.scala:150)
>>> at
>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:154)
>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>> Caused by:* java.lang.ClassNotFoundException:
>>> scala.collection.GenTraversableOnce$class*
>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>> ... 5 more
>>> 16/06/17 15:15:58 INFO SparkContext: Invoking stop() from shutdown hook
>>>
>>>
>>>
>>> On Fri, Jun 17, 2016 at 2:59 PM, Marco Mistroni 
>>> wrote:
>>>
>>>> So you are using spark-submit  or spark-shell?
>>>>
>>>> you will need to launch either by passing --packages option (like in
>>>> the example below for spark-csv). you will need to iknow
>>>>
>>>> --packages com.databricks:spark-xml_:
>>>>
>>>> hth
>>>>
>>>>
>>>>
>>>> On Fri, Jun 17, 2016 at 10:20 AM, VG  wrote:
>>>>
>>>>> Apologies for that.
>>>>> I am trying to use spark-xml to load data of a xml file.
>>>>>
>>>>> here is the exception
>>>>>
>>>>> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
>>>>> Exception in thread "main" java.lang.ClassNotFoundException: Failed to
>>>>> find data source: org.apache.spark.xml. Please find packages at
>>>>> http://spark-packages.org
>>>>> at
>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
>>>>> at
>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
>>>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>>>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>>>> Caused by: java.lang.ClassNotFoundException:
>>>>> org.apache.spark.xml.DefaultSource
>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>>> at
>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>>>>> at
>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>>>>> at scala.util.Try$.apply(Try.scala:192)
>>>>> at
>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>>>>> at
>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>>>>> at scala.util.Try.orElse(Try.scala:84)
>>>>> at
>>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
>>>>> ... 4 more
>>>>>
>>>>> Code
>>>>> SQLContext sqlContext = new SQLContext(sc);
>>>>> DataFrame df = sqlContext.read()
>>>>> .format("org.apache.spark.xml")
>>>>> .option("rowTag", "row")
>>>>> .load("A.xml");
>>>>>
>>>>> Any suggestions please ..
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Jun 17, 2016 at 2:42 PM, Marco Mistroni 
>>>>> wrote:
>>>>>
>>>>>> too little info
>>>>>> it'll help if you can post the exception and show your sbt file (if
>>>>>> you are using sbt), and provide minimal details on what you are doing
>>>>>> kr
>>>>>>
>>>>>> On Fri, Jun 17, 2016 at 10:08 AM, VG  wrote:
>>>>>>
>>>>>>> Failed to find data source: com.databricks.spark.xml
>>>>>>>
>>>>>>> Any suggestions to resolve this
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Siva A
If you are running from an IDE, are you using IntelliJ?

On Fri, Jun 17, 2016 at 3:20 PM, Siva A  wrote:

> Can you try to package as a jar and run using spark-submit
>
> Siva
>
> On Fri, Jun 17, 2016 at 3:17 PM, VG  wrote:
>
>> I am trying to run from IDE and everything else is working fine.
>> I added spark-xml jar and now I ended up into this dependency
>>
>> 6/06/17 15:15:57 INFO BlockManagerMaster: Registered BlockManager
>> Exception in thread "main" *java.lang.NoClassDefFoundError:
>> scala/collection/GenTraversableOnce$class*
>> at
>> org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.(ddl.scala:150)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:154)
>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>> Caused by:* java.lang.ClassNotFoundException:
>> scala.collection.GenTraversableOnce$class*
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> ... 5 more
>> 16/06/17 15:15:58 INFO SparkContext: Invoking stop() from shutdown hook
>>
>>
>>
>> On Fri, Jun 17, 2016 at 2:59 PM, Marco Mistroni 
>> wrote:
>>
>>> So you are using spark-submit  or spark-shell?
>>>
>>> you will need to launch either by passing --packages option (like in the
>>> example below for spark-csv). you will need to iknow
>>>
>>> --packages com.databricks:spark-xml_:
>>>
>>> hth
>>>
>>>
>>>
>>> On Fri, Jun 17, 2016 at 10:20 AM, VG  wrote:
>>>
>>>> Apologies for that.
>>>> I am trying to use spark-xml to load data of a xml file.
>>>>
>>>> here is the exception
>>>>
>>>> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
>>>> Exception in thread "main" java.lang.ClassNotFoundException: Failed to
>>>> find data source: org.apache.spark.xml. Please find packages at
>>>> http://spark-packages.org
>>>> at
>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
>>>> at
>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
>>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>>> Caused by: java.lang.ClassNotFoundException:
>>>> org.apache.spark.xml.DefaultSource
>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>> at
>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>>>> at
>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>>>> at scala.util.Try$.apply(Try.scala:192)
>>>> at
>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>>>> at
>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>>>> at scala.util.Try.orElse(Try.scala:84)
>>>> at
>>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
>>>> ... 4 more
>>>>
>>>> Code
>>>> SQLContext sqlContext = new SQLContext(sc);
>>>> DataFrame df = sqlContext.read()
>>>> .format("org.apache.spark.xml")
>>>> .option("rowTag", "row")
>>>> .load("A.xml");
>>>>
>>>> Any suggestions please ..
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Jun 17, 2016 at 2:42 PM, Marco Mistroni 
>>>> wrote:
>>>>
>>>>> too little info
>>>>> it'll help if you can post the exception and show your sbt file (if
>>>>> you are using sbt), and provide minimal details on what you are doing
>>>>> kr
>>>>>
>>>>> On Fri, Jun 17, 2016 at 10:08 AM, VG  wrote:
>>>>>
>>>>>> Failed to find data source: com.databricks.spark.xml
>>>>>>
>>>>>> Any suggestions to resolve this
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Siva A
Can you try to package as a jar and run using spark-submit

Siva

On Fri, Jun 17, 2016 at 3:17 PM, VG  wrote:

> I am trying to run from IDE and everything else is working fine.
> I added spark-xml jar and now I ended up into this dependency
>
> 6/06/17 15:15:57 INFO BlockManagerMaster: Registered BlockManager
> Exception in thread "main" *java.lang.NoClassDefFoundError:
> scala/collection/GenTraversableOnce$class*
> at
> org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.(ddl.scala:150)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:154)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
> Caused by:* java.lang.ClassNotFoundException:
> scala.collection.GenTraversableOnce$class*
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> ... 5 more
> 16/06/17 15:15:58 INFO SparkContext: Invoking stop() from shutdown hook
>
>
>
> On Fri, Jun 17, 2016 at 2:59 PM, Marco Mistroni 
> wrote:
>
>> So you are using spark-submit  or spark-shell?
>>
>> you will need to launch either by passing --packages option (like in the
>> example below for spark-csv). you will need to iknow
>>
>> --packages com.databricks:spark-xml_:
>>
>> hth
>>
>>
>>
>> On Fri, Jun 17, 2016 at 10:20 AM, VG  wrote:
>>
>>> Apologies for that.
>>> I am trying to use spark-xml to load data of a xml file.
>>>
>>> here is the exception
>>>
>>> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
>>> Exception in thread "main" java.lang.ClassNotFoundException: Failed to
>>> find data source: org.apache.spark.xml. Please find packages at
>>> http://spark-packages.org
>>> at
>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
>>> at
>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>>> Caused by: java.lang.ClassNotFoundException:
>>> org.apache.spark.xml.DefaultSource
>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>> at
>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>>> at
>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>>> at scala.util.Try$.apply(Try.scala:192)
>>> at
>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>>> at
>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>>> at scala.util.Try.orElse(Try.scala:84)
>>> at
>>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
>>> ... 4 more
>>>
>>> Code
>>> SQLContext sqlContext = new SQLContext(sc);
>>> DataFrame df = sqlContext.read()
>>> .format("org.apache.spark.xml")
>>> .option("rowTag", "row")
>>> .load("A.xml");
>>>
>>> Any suggestions please ..
>>>
>>>
>>>
>>>
>>> On Fri, Jun 17, 2016 at 2:42 PM, Marco Mistroni 
>>> wrote:
>>>
>>>> too little info
>>>> it'll help if you can post the exception and show your sbt file (if you
>>>> are using sbt), and provide minimal details on what you are doing
>>>> kr
>>>>
>>>> On Fri, Jun 17, 2016 at 10:08 AM, VG  wrote:
>>>>
>>>>> Failed to find data source: com.databricks.spark.xml
>>>>>
>>>>> Any suggestions to resolve this
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>


Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread VG
I am trying to run from IDE and everything else is working fine.
I added spark-xml jar and now I ended up into this dependency

6/06/17 15:15:57 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" *java.lang.NoClassDefFoundError:
scala/collection/GenTraversableOnce$class*
at
org.apache.spark.sql.execution.datasources.CaseInsensitiveMap.(ddl.scala:150)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:154)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
Caused by:* java.lang.ClassNotFoundException:
scala.collection.GenTraversableOnce$class*
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 5 more
16/06/17 15:15:58 INFO SparkContext: Invoking stop() from shutdown hook



On Fri, Jun 17, 2016 at 2:59 PM, Marco Mistroni  wrote:

> So you are using spark-submit  or spark-shell?
>
> you will need to launch either by passing --packages option (like in the
> example below for spark-csv). you will need to iknow
>
> --packages com.databricks:spark-xml_:
>
> hth
>
>
>
> On Fri, Jun 17, 2016 at 10:20 AM, VG  wrote:
>
>> Apologies for that.
>> I am trying to use spark-xml to load data of a xml file.
>>
>> here is the exception
>>
>> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
>> Exception in thread "main" java.lang.ClassNotFoundException: Failed to
>> find data source: org.apache.spark.xml. Please find packages at
>> http://spark-packages.org
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.spark.xml.DefaultSource
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>> at scala.util.Try$.apply(Try.scala:192)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>> at scala.util.Try.orElse(Try.scala:84)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
>> ... 4 more
>>
>> Code
>> SQLContext sqlContext = new SQLContext(sc);
>> DataFrame df = sqlContext.read()
>> .format("org.apache.spark.xml")
>> .option("rowTag", "row")
>> .load("A.xml");
>>
>> Any suggestions please ..
>>
>>
>>
>>
>> On Fri, Jun 17, 2016 at 2:42 PM, Marco Mistroni 
>> wrote:
>>
>>> too little info
>>> it'll help if you can post the exception and show your sbt file (if you
>>> are using sbt), and provide minimal details on what you are doing
>>> kr
>>>
>>> On Fri, Jun 17, 2016 at 10:08 AM, VG  wrote:
>>>
>>>> Failed to find data source: com.databricks.spark.xml
>>>>
>>>> Any suggestions to resolve this
>>>>
>>>>
>>>>
>>>
>>
>


Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread VG
Hi Siva,

I still get a similar exception (See the highlighted section - It is
looking for DataSource)
16/06/17 15:11:37 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.lang.ClassNotFoundException: Failed to find
data source: xml. Please find packages at http://spark-packages.org
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
*Caused by: java.lang.ClassNotFoundException: xml.DefaultSource*
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
at scala.util.Try$.apply(Try.scala:192)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
at scala.util.Try.orElse(Try.scala:84)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
... 4 more
16/06/17 15:11:38 INFO SparkContext: Invoking stop() from shutdown hook
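
For what it's worth, a minimal sketch of the read call with the fully qualified provider name; it only resolves once the spark-xml jar is actually on the application classpath (via the Maven dependency or the --packages option discussed in this thread). The class name and local master below are illustrative, not taken from the thread:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.SQLContext;

public class XmlReadSketch {
  public static void main(String[] args) {
    // hypothetical local setup; in the thread the context already exists as `sc`
    JavaSparkContext sc = new JavaSparkContext(
        new SparkConf().setAppName("xml-read").setMaster("local[*]"));
    SQLContext sqlContext = new SQLContext(sc);
    DataFrame df = sqlContext.read()
        .format("com.databricks.spark.xml") // fully qualified provider; per the suggestion above, "xml" should also work once the jar is present
        .option("rowTag", "row")
        .load("A.xml");
    df.show();
  }
}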



On Fri, Jun 17, 2016 at 2:56 PM, Siva A  wrote:

> Just try to use "xml" as format like below,
>
> SQLContext sqlContext = new SQLContext(sc);
> DataFrame df = sqlContext.read()
> .format("xml")
> .option("rowTag", "row")
> .load("A.xml");
>
> FYR: https://github.com/databricks/spark-xml
>
> --Siva
>
> On Fri, Jun 17, 2016 at 2:50 PM, VG  wrote:
>
>> Apologies for that.
>> I am trying to use spark-xml to load data of a xml file.
>>
>> here is the exception
>>
>> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
>> Exception in thread "main" java.lang.ClassNotFoundException: Failed to
>> find data source: org.apache.spark.xml. Please find packages at
>> http://spark-packages.org
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.spark.xml.DefaultSource
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>> at scala.util.Try$.apply(Try.scala:192)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>> at scala.util.Try.orElse(Try.scala:84)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
>> ... 4 more
>>
>> Code
>> SQLContext sqlContext = new SQLContext(sc);
>> DataFrame df = sqlContext.read()
>> .format("org.apache.spark.xml")
>> .option("rowTag", "row")
>> .load("A.xml");
>>
>> Any suggestions please ..
>>
>>
>>
>>
>> On Fri, Jun 17, 2016 at 2:42 PM, Marco Mistroni 
>> wrote:
>>
>>> too little info
>>> it'll help if you can post the exception and show your sbt file (if you
>>> are using sbt), and provide minimal details on what you are doing
>>> kr
>>>
>>> On Fri, Jun 17, 2016 at 10:08 AM, VG  wrote:
>>>
>>>> Failed to find data source: com.databricks.spark.xml
>>>>
>>>> Any suggestions to resolve this
>>>>
>>>>
>>>>
>>>
>>
>


Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Marco Mistroni
So you are using spark-submit  or spark-shell?

You will need to launch either of them by passing the --packages option (like in the
example below for spark-csv). You will need to know the Scala and spark-xml versions:

--packages com.databricks:spark-xml_<scala version>:<spark-xml version>

hth
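
A filled-in form of that command, assuming the Scala 2.10 / spark-xml 0.3.3 coordinates that appear elsewhere in this thread, the main class from the stack trace, and an illustrative jar path:

spark-submit --packages com.databricks:spark-xml_2.10:0.3.3 --class org.ariba.spark.PostsProcessing /path/to/my-app.jar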



On Fri, Jun 17, 2016 at 10:20 AM, VG  wrote:

> Apologies for that.
> I am trying to use spark-xml to load data of a xml file.
>
> here is the exception
>
> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
> Exception in thread "main" java.lang.ClassNotFoundException: Failed to
> find data source: org.apache.spark.xml. Please find packages at
> http://spark-packages.org
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.spark.xml.DefaultSource
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
> at scala.util.Try$.apply(Try.scala:192)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
> at scala.util.Try.orElse(Try.scala:84)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
> ... 4 more
>
> Code
> SQLContext sqlContext = new SQLContext(sc);
> DataFrame df = sqlContext.read()
> .format("org.apache.spark.xml")
> .option("rowTag", "row")
> .load("A.xml");
>
> Any suggestions please ..
>
>
>
>
> On Fri, Jun 17, 2016 at 2:42 PM, Marco Mistroni 
> wrote:
>
>> too little info
>> it'll help if you can post the exception and show your sbt file (if you
>> are using sbt), and provide minimal details on what you are doing
>> kr
>>
>> On Fri, Jun 17, 2016 at 10:08 AM, VG  wrote:
>>
>>> Failed to find data source: com.databricks.spark.xml
>>>
>>> Any suggestions to resolve this
>>>
>>>
>>>
>>
>


Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Siva A
If it's not working,

add the packages while executing spark-submit/spark-shell, like below:

$SPARK_HOME/bin/spark-shell --packages com.databricks:spark-xml_2.10:0.3.3

$SPARK_HOME/bin/spark-submit --packages com.databricks:spark-xml_2.10:0.3.3



On Fri, Jun 17, 2016 at 2:56 PM, Siva A  wrote:

> Just try to use "xml" as format like below,
>
> SQLContext sqlContext = new SQLContext(sc);
> DataFrame df = sqlContext.read()
> .format("xml")
> .option("rowTag", "row")
> .load("A.xml");
>
> FYR: https://github.com/databricks/spark-xml
>
> --Siva
>
> On Fri, Jun 17, 2016 at 2:50 PM, VG  wrote:
>
>> Apologies for that.
>> I am trying to use spark-xml to load data of a xml file.
>>
>> here is the exception
>>
>> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
>> Exception in thread "main" java.lang.ClassNotFoundException: Failed to
>> find data source: org.apache.spark.xml. Please find packages at
>> http://spark-packages.org
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
>> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
>> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.spark.xml.DefaultSource
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
>> at scala.util.Try$.apply(Try.scala:192)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
>> at scala.util.Try.orElse(Try.scala:84)
>> at
>> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
>> ... 4 more
>>
>> Code
>> SQLContext sqlContext = new SQLContext(sc);
>> DataFrame df = sqlContext.read()
>> .format("org.apache.spark.xml")
>> .option("rowTag", "row")
>> .load("A.xml");
>>
>> Any suggestions please ..
>>
>>
>>
>>
>> On Fri, Jun 17, 2016 at 2:42 PM, Marco Mistroni 
>> wrote:
>>
>>> too little info
>>> it'll help if you can post the exception and show your sbt file (if you
>>> are using sbt), and provide minimal details on what you are doing
>>> kr
>>>
>>> On Fri, Jun 17, 2016 at 10:08 AM, VG  wrote:
>>>
>>>> Failed to find data source: com.databricks.spark.xml
>>>>
>>>> Any suggestions to resolve this
>>>>
>>>>
>>>>
>>>
>>
>


Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Siva A
Just try using "xml" as the format, like below:

SQLContext sqlContext = new SQLContext(sc);
DataFrame df = sqlContext.read()
.format("xml")
.option("rowTag", "row")
.load("A.xml");

FYR: https://github.com/databricks/spark-xml

--Siva

On Fri, Jun 17, 2016 at 2:50 PM, VG  wrote:

> Apologies for that.
> I am trying to use spark-xml to load data of a xml file.
>
> here is the exception
>
> 16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
> Exception in thread "main" java.lang.ClassNotFoundException: Failed to
> find data source: org.apache.spark.xml. Please find packages at
> http://spark-packages.org
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
> at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
> at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.spark.xml.DefaultSource
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
> at scala.util.Try$.apply(Try.scala:192)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
> at scala.util.Try.orElse(Try.scala:84)
> at
> org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
> ... 4 more
>
> Code
> SQLContext sqlContext = new SQLContext(sc);
> DataFrame df = sqlContext.read()
> .format("org.apache.spark.xml")
> .option("rowTag", "row")
> .load("A.xml");
>
> Any suggestions please ..
>
>
>
>
> On Fri, Jun 17, 2016 at 2:42 PM, Marco Mistroni 
> wrote:
>
>> too little info
>> it'll help if you can post the exception and show your sbt file (if you
>> are using sbt), and provide minimal details on what you are doing
>> kr
>>
>> On Fri, Jun 17, 2016 at 10:08 AM, VG  wrote:
>>
>>> Failed to find data source: com.databricks.spark.xml
>>>
>>> Any suggestions to resolve this
>>>
>>>
>>>
>>
>


Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread VG
Apologies for that.
I am trying to use spark-xml to load data from an XML file.

here is the exception

16/06/17 14:49:04 INFO BlockManagerMaster: Registered BlockManager
Exception in thread "main" java.lang.ClassNotFoundException: Failed to find
data source: org.apache.spark.xml. Please find packages at
http://spark-packages.org
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:77)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:102)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
at org.ariba.spark.PostsProcessing.main(PostsProcessing.java:19)
Caused by: java.lang.ClassNotFoundException:
org.apache.spark.xml.DefaultSource
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4$$anonfun$apply$1.apply(ResolvedDataSource.scala:62)
at scala.util.Try$.apply(Try.scala:192)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$$anonfun$4.apply(ResolvedDataSource.scala:62)
at scala.util.Try.orElse(Try.scala:84)
at
org.apache.spark.sql.execution.datasources.ResolvedDataSource$.lookupDataSource(ResolvedDataSource.scala:62)
... 4 more

Code
SQLContext sqlContext = new SQLContext(sc);
DataFrame df = sqlContext.read()
.format("org.apache.spark.xml")
.option("rowTag", "row")
.load("A.xml");

Any suggestions please ..




On Fri, Jun 17, 2016 at 2:42 PM, Marco Mistroni  wrote:

> too little info
> it'll help if you can post the exception and show your sbt file (if you
> are using sbt), and provide minimal details on what you are doing
> kr
>
> On Fri, Jun 17, 2016 at 10:08 AM, VG  wrote:
>
>> Failed to find data source: com.databricks.spark.xml
>>
>> Any suggestions to resolve this
>>
>>
>>
>


Re: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread Marco Mistroni
Too little info.
It'll help if you can post the exception and show your sbt file (if you are
using sbt), and provide minimal details on what you are doing.
kr

On Fri, Jun 17, 2016 at 10:08 AM, VG  wrote:

> Failed to find data source: com.databricks.spark.xml
>
> Any suggestions to resolve this
>
>
>


java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml. Please find packages at http://spark-packages.org

2016-06-17 Thread VG
Failed to find data source: com.databricks.spark.xml

Any suggestions to resolve this?


Re: ERROR TaskResultGetter: Exception while getting task result java.io.IOException: java.lang.ClassNotFoundException: scala.Some

2016-06-16 Thread Jacek Laskowski
Hi,

Why did you mark spark-core as provided while the others are not? How do
you assemble the app? How do you submit it for execution? What's the
deployment environment?

More info...more info...
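For reference, the kind of setup those questions usually lead to (a sketch
only, assuming the sbt-assembly plugin is available) marks every Spark module
as provided and ships everything else in one fat jar:

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"  % "1.4.0" % "provided",
  "org.apache.spark" %% "spark-mllib" % "1.4.0" % "provided",
  "org.apache.spark" %% "spark-sql"   % "1.4.0" % "provided"
)

// then: sbt assembly, and submit the resulting jar with spark-submit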

Jacek
On 15 Jun 2016 10:26 p.m., "S Sarkar"  wrote:

Hello,

I built package for a spark application with the following sbt file:

name := "Simple Project"

version := "1.0"

scalaVersion := "2.10.3"

libraryDependencies ++= Seq(
  "org.apache.spark"  %% "spark-core"  % "1.4.0" % "provided",
  "org.apache.spark"  %% "spark-mllib" % "1.4.0",
  "org.apache.spark"  %% "spark-sql"   % "1.4.0",
  "org.apache.spark"  %% "spark-sql"   % "1.4.0"
  )
resolvers += "Akka Repository" at "http://repo.akka.io/releases/";

I am getting TaskResultGetter error with ClassNotFoundException for
scala.Some .

Can I please get some help how to fix it?

Thanks,
S. Sarkar



--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/ERROR-TaskResultGetter-Exception-while-getting-task-result-java-io-IOException-java-lang-ClassNotFoue-tp27178.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org


Re: [scala-user] ERROR TaskResultGetter: Exception while getting task result java.io.IOException: java.lang.ClassNotFoundException: scala.Some

2016-06-16 Thread Oliver Ruebenacker
 Hello,

  It would be useful to see the code that throws the exception. It probably
means that the Scala standard library is not being uploaded to the
executors. Try adding the Scala standard library to the SBT file
("org.scala-lang" % "scala-library" % "2.10.3"), or check your
configuration. Also, did you launch using spark-submit?
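  A minimal sketch of that change against the sbt file quoted below (only the
scala-library line is new; the duplicate spark-sql entry can also be dropped):

scalaVersion := "2.10.3"

libraryDependencies ++= Seq(
  "org.scala-lang"    %  "scala-library" % "2.10.3",
  "org.apache.spark"  %% "spark-core"    % "1.4.0" % "provided",
  "org.apache.spark"  %% "spark-mllib"   % "1.4.0",
  "org.apache.spark"  %% "spark-sql"     % "1.4.0"
)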

 Best, Oliver

On Wed, Jun 15, 2016 at 4:16 PM,  wrote:

> Hello,
>
> I am building package for spark application with the following sbt file:
>
> name := "Simple Project"
>
> version := "1.0"
>
> scalaVersion := "2.10.3"
>
> libraryDependencies ++= Seq(
>   "org.apache.spark"  %% "spark-core"  % "1.4.0" % "provided",
>   "org.apache.spark"  %% "spark-mllib" % "1.4.0",
>   "org.apache.spark"  %% "spark-sql"   % "1.4.0",
>   "org.apache.spark"  %% "spark-sql"   % "1.4.0"
>   )
> resolvers += "Akka Repository" at "http://repo.akka.io/releases/";
>
> I am getting TaskResultGetter error with ClassNotFoundException for
> scala.Some .
>
> Can I please get some help how to fix it?
>
> Thanks,
> S. Sarkar
>
> --
> You received this message because you are subscribed to the Google Groups
> "scala-user" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to scala-user+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Oliver Ruebenacker
Senior Software Engineer, Diabetes Portal, Broad Institute



ERROR TaskResultGetter: Exception while getting task result java.io.IOException: java.lang.ClassNotFoundException: scala.Some

2016-06-15 Thread S Sarkar
Hello,

I built package for a spark application with the following sbt file:

name := "Simple Project"

version := "1.0"

scalaVersion := "2.10.3"

libraryDependencies ++= Seq(
  "org.apache.spark"  %% "spark-core"  % "1.4.0" % "provided",
  "org.apache.spark"  %% "spark-mllib" % "1.4.0",
  "org.apache.spark"  %% "spark-sql"   % "1.4.0",
  "org.apache.spark"  %% "spark-sql"   % "1.4.0"
  )
resolvers += "Akka Repository" at "http://repo.akka.io/releases/";

I am getting a TaskResultGetter error with ClassNotFoundException for
scala.Some.

Can I please get some help on how to fix it?

Thanks,
S. Sarkar



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/ERROR-TaskResultGetter-Exception-while-getting-task-result-java-io-IOException-java-lang-ClassNotFoue-tp27178.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Analyzing json Data streams using sparkSQL in spark streaming returns java.lang.ClassNotFoundException

2016-03-08 Thread Tristan Nixon
This is a bit strange, because you're trying to create an RDD (the
jsonElements) inside a foreach function. This executes on the workers, and so
will actually produce a different instance in each JVM on each worker, not one
single RDD referenced by the driver, which is what I think you're trying to get.

Why don’t you try something like:

JavaDStream<String> jsonElements = lines.flatMap( … )

and just skip the lines.foreachRDD?
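For instance, a rough sketch of that rearrangement, dropped into the same main
method as your snippet (JavaSQLContextSingleton and executeSQLOperations are
your own helpers; this also assumes your Spark version accepts a lambda in
foreachRDD, as your snippet already does):

// Per-line splitting and filtering expressed as DStream transformations.
JavaDStream<String> jsonElements = lines
    .flatMap(line -> Arrays.asList(line.split("\n")))
    .filter(line -> line.length() > 0);

// One DataFrame per batch, built from the already-transformed RDD.
jsonElements.foreachRDD(rdd -> {
    SQLContext sqlContext = JavaSQLContextSingleton.getInstance(rdd.context());
    DataFrame dfJsonElement = sqlContext.read().json(rdd);
    executeSQLOperations(sqlContext, dfJsonElement);
});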

> On Mar 8, 2016, at 11:59 AM, Nesrine BEN MUSTAPHA 
>  wrote:
> 
> Hello,
> 
> I tried to use sparkSQL to analyse json data streams within a standalone 
> application. 
> 
> here the code snippet that receive the streaming data: 
> final JavaReceiverInputDStream lines = 
> streamCtx.socketTextStream("localhost", Integer.parseInt(args[0]), 
> StorageLevel.MEMORY_AND_DISK_SER_2());
> 
> lines.foreachRDD((rdd) -> {
> 
> final JavaRDD jsonElements = rdd.flatMap(new FlatMapFunction String>() {
> 
> @Override
> 
> public Iterable call(final String line)
> 
> throws Exception {
> 
> return Arrays.asList(line.split("\n"));
> 
> }
> 
> }).filter(new Function() {
> 
> @Override
> 
> public Boolean call(final String v1)
> 
> throws Exception {
> 
> return v1.length() > 0;
> 
> }
> 
> });
> 
> //System.out.println("Data Received = " + jsonElements.collect().size());
> 
> final SQLContext sqlContext = 
> JavaSQLContextSingleton.getInstance(rdd.context());
> 
> final DataFrame dfJsonElement = sqlContext.read().json(jsonElements); 
> 
> executeSQLOperations(sqlContext, dfJsonElement);
> 
> });
> 
> streamCtx.start();
> 
> streamCtx.awaitTermination();
> 
> }
> 
> 
> 
> 
> 
> 
> 
> 
> 
> I got the following error when the red line is executed:
> 
> java.lang.ClassNotFoundException: 
> com.intrinsec.common.spark.SQLStreamingJsonAnalyzer$2
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:348)
>   at 
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)
>   at 
> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
>   at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
>   at 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
>   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>   at 
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
> 
> 
> 
> 
> 
> 



Analyzing json Data streams using sparkSQL in spark streaming returns java.lang.ClassNotFoundException

2016-03-08 Thread Nesrine BEN MUSTAPHA
Hello,

I tried to use Spark SQL to analyse JSON data streams within a standalone
application.

Here is the code snippet that receives the streaming data:

final JavaReceiverInputDStream<String> lines =
    streamCtx.socketTextStream("localhost", Integer.parseInt(args[0]),
        StorageLevel.MEMORY_AND_DISK_SER_2());

lines.foreachRDD((rdd) -> {

    final JavaRDD<String> jsonElements = rdd.flatMap(new FlatMapFunction<String, String>() {

        @Override
        public Iterable<String> call(final String line) throws Exception {
            return Arrays.asList(line.split("\n"));
        }

    }).filter(new Function<String, Boolean>() {

        @Override
        public Boolean call(final String v1) throws Exception {
            return v1.length() > 0;
        }

    });

    // System.out.println("Data Received = " + jsonElements.collect().size());

    final SQLContext sqlContext =
        JavaSQLContextSingleton.getInstance(rdd.context());

    final DataFrame dfJsonElement = sqlContext.read().json(jsonElements);

    executeSQLOperations(sqlContext, dfJsonElement);

});

streamCtx.start();

streamCtx.awaitTermination();

}


I got the following error when the DataFrame creation line
(sqlContext.read().json(jsonElements)) is executed:

java.lang.ClassNotFoundException:
com.intrinsec.common.spark.SQLStreamingJsonAnalyzer$2
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at 
org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:68)
at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)


starting start-master.sh throws "java.lang.ClassNotFoundException: org.slf4j.Logger" error

2015-11-26 Thread Mich Talebzadeh
Hi,

 

I just built Spark without the Hive jars and am trying to run

 

start-master.sh

 

I get this error in the log. It sounds like it cannot find org.slf4j.Logger
(java.lang.ClassNotFoundException).

 

Spark Command: /usr/java/latest/bin/java -cp /usr/lib/spark/sbin/../conf/:/usr/lib/spark/lib/spark-assembly-1.5.2-hadoop2.6.0.jar -Xms1g -Xmx1g -XX:MaxPermSize=256m org.apache.spark.deploy.master.Master --ip rhes564 --port 7077 --webui-port 8080



Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/Logger

at java.lang.Class.getDeclaredMethods0(Native Method)

at java.lang.Class.privateGetDeclaredMethods(Class.java:2521)

at java.lang.Class.getMethod0(Class.java:2764)

at java.lang.Class.getMethod(Class.java:1653)

at
sun.launcher.LauncherHelper.getMainMethod(LauncherHelper.java:494)

at
sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:486)

Caused by: java.lang.ClassNotFoundException: org.slf4j.Logger

at java.net.URLClassLoader$1.run(URLClassLoader.java:366)

at java.net.URLClassLoader$1.run(URLClassLoader.java:355)

at java.security.AccessController.doPrivileged(Native Method)

at java.net.URLClassLoader.findClass(URLClassLoader.java:354)

at java.lang.ClassLoader.loadClass(ClassLoader.java:424)

at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)

at java.lang.ClassLoader.loadClass(ClassLoader.java:357)

... 6 more

 

Although I have added it to the CLASSPATH.
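(For reference, one way to make that explicit, assuming this 1.5.x launcher
still honours the SPARK_CLASSPATH environment variable (the jar paths below
are only placeholders), is a line like this in conf/spark-env.sh:

export SPARK_CLASSPATH="$SPARK_CLASSPATH:/opt/jars/slf4j-api-1.7.10.jar:/opt/jars/slf4j-log4j12-1.7.10.jar"

followed by restarting the master with sbin/stop-master.sh and
sbin/start-master.sh.)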

 

Mich Talebzadeh

 


 

http://talebzadehmich.wordpress.com

 


 



Spark+Groovy: java.lang.ClassNotFoundException: org.apache.spark.rpc.akka.AkkaRpcEnvFactory

2015-11-18 Thread tog
Hello

I am trying to use Spark from Groovy.
When using the Grab feature, which is supposed to download dependencies, I am
facing a ClassNotFoundException:

@Grab(group='org.apache.spark', module='spark-core_2.10', version='1.5.2')

I am trying to look at the jars that might be pulled in by spark-core_2.10. I
downloaded the binary release for Hadoop 2.6 but found that everything was
inside a fat jar, spark-assembly-1.5.2-hadoop2.6.0.jar.
Is there a distribution anywhere with individual jars?

Cheers
Guillaume


-- 
PGP KeyID: 2048R/EA31CFC9  subkeys.pgp.net


Re: java.lang.ClassNotFoundException: org.apache.spark.streaming.twitter.TwitterReceiver

2015-11-09 Thread DW @ Gmail
>>>>>> > http://maven.apache.org/POM/4.0.0";
>>>>>> > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
>>>>>> > xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
>>>>>> > http://maven.apache.org/xsd/maven-4.0.0.xsd";>
>>>>>> > 4.0.0
>>>>>> > SparkFirstTry
>>>>>> > SparkFirstTry
>>>>>> > 0.0.1-SNAPSHOT
>>>>>> >
>>>>>> > 
>>>>>> > 
>>>>>> > org.apache.spark
>>>>>> > spark-core_2.10
>>>>>> > 1.5.1
>>>>>> > provided
>>>>>> > 
>>>>>> >
>>>>>> > 
>>>>>> > org.apache.spark
>>>>>> > spark-streaming_2.10
>>>>>> > 1.5.1
>>>>>> > provided
>>>>>> > 
>>>>>> >
>>>>>> > 
>>>>>> > org.twitter4j
>>>>>> > twitter4j-stream
>>>>>> > 3.0.3
>>>>>> > 
>>>>>> > 
>>>>>> > org.apache.spark
>>>>>> > spark-streaming-twitter_2.10
>>>>>> > 1.0.0
>>>>>> > 
>>>>>> >
>>>>>> > 
>>>>>> >
>>>>>> > 
>>>>>> > src
>>>>>> > 
>>>>>> > 
>>>>>> > maven-compiler-plugin
>>>>>> > 3.3
>>>>>> > 
>>>>>> > 1.8
>>>>>> > 1.8
>>>>>> > 
>>>>>> > 
>>>>>> > 
>>>>>> > maven-assembly-plugin
>>>>>> > 
>>>>>> > 
>>>>>> > 
>>>>>> >
>>>>>> > com.test.sparkTest.SimpleApp
>>>>>> > 
>>>>>> > 
>>>>>> > 
>>>>>> > 
>>>>>> > jar-with-dependencies
>>>>>> > 
>>>>>> > 
>>>>>> > 
>>>>>> >
>>>>>> > 
>>>>>> > 
>>>>>> > 
>>>>>> >
>>>>>> >
>>>>>> > The application starts successfully but no tweets comes and this 
>>>>>> > exception
>>>>>> > is thrown
>>>>>> >
>>>>>> > 15/11/08 15:55:46 WARN TaskSetManager: Lost task 0.0 in stage 4.0 (TID 
>>>>>> > 78,
>>>>>> > 192.168.122.39): java.io.IOException: java.lang.ClassNotFoundException:
>>>>>> > org.apache.spark.streaming.twitter.TwitterReceiver
>>>>>> > at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1163)
>>>>>> > at
>>>>>> > org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:70)
>>>>>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>> > at
>>>>>> > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>>>> > at
>>>>>> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>>> > at java.lang.reflect.Method.invoke(Method.java:497)
>>>>>> > at
>>>>>> > java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
>>>>>> > at 
>>>>>> > java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900)
>>>>>> > at
>>>>>> > java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)

Re: java.lang.ClassNotFoundException: org.apache.spark.streaming.twitter.TwitterReceiver

2015-11-09 Thread Tathagata Das
>>>> > 
>>>>> > 
>>>>> > org.apache.spark
>>>>> > spark-core_2.10
>>>>> > 1.5.1
>>>>> > provided
>>>>> > 
>>>>> >
>>>>> > 
>>>>> > org.apache.spark
>>>>> > spark-streaming_2.10
>>>>> > 1.5.1
>>>>> > provided
>>>>> > 
>>>>> >
>>>>> > 
>>>>> > org.twitter4j
>>>>> > twitter4j-stream
>>>>> > 3.0.3
>>>>> > 
>>>>> > 
>>>>> > org.apache.spark
>>>>> > spark-streaming-twitter_2.10
>>>>> > 1.0.0
>>>>> > 
>>>>> >
>>>>> > 
>>>>> >
>>>>> > 
>>>>> > src
>>>>> > 
>>>>> > 
>>>>> > maven-compiler-plugin
>>>>> > 3.3
>>>>> > 
>>>>> > 1.8
>>>>> > 1.8
>>>>> > 
>>>>> > 
>>>>> > 
>>>>> > maven-assembly-plugin
>>>>> > 
>>>>> > 
>>>>> > 
>>>>> >
>>>>> > com.test.sparkTest.SimpleApp
>>>>> > 
>>>>> > 
>>>>> > 
>>>>> >
>>>>>  jar-with-dependencies
>>>>> > 
>>>>> > 
>>>>> > 
>>>>> >
>>>>> > 
>>>>> > 
>>>>> > 
>>>>> >
>>>>> >
>>>>> > The application starts successfully but no tweets comes and this
>>>>> exception
>>>>> > is thrown
>>>>> >
>>>>> > 15/11/08 15:55:46 WARN TaskSetManager: Lost task 0.0 in stage 4.0
>>>>> (TID 78,
>>>>> > 192.168.122.39): java.io.IOException:
>>>>> java.lang.ClassNotFoundException:
>>>>> > org.apache.spark.streaming.twitter.TwitterReceiver
>>>>> > at
>>>>> org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1163)
>>>>> > at
>>>>> >
>>>>> org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:70)
>>>>> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>> > at
>>>>> >
>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>>>>> > at
>>>>> >
>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>>>> > at java.lang.reflect.Method.invoke(Method.java:497)
>>>>> > at
>>>>> >
>>>>> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
>>>>> > at
>>>>> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900)
>>>>> > at
>>>>> >
>>>>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>>>>> > at
>>>>> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>>>>> > at
>>>>> >
>>>>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
>>>>> > at
>>>>> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
>>>>> > at
>>>>> >
>>>>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>>>>> > at
>>>>> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>>>>> > at
>>>>> java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)

Re: java.lang.ClassNotFoundException: org.apache.spark.streaming.twitter.TwitterReceiver

2015-11-09 Thread أنس الليثي
If I package the application and submit it, it works fine, but I need to
run it from Eclipse.

Is there any problem running the application from Eclipse?
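(For reference, one approach that is sometimes used when launching from an IDE,
sketched here with an illustrative path to the jar-with-dependencies built from
the pom above, is to ship the fat jar explicitly via SparkConf:

SparkConf conf = new SparkConf()
        .setAppName("Simple Application")
        .setMaster("spark://rethink-node01:7077")
        // Ship the assembled jar to the executors when not using spark-submit.
        .setJars(new String[] {
                "target/SparkFirstTry-0.0.1-SNAPSHOT-jar-with-dependencies.jar" });

)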



On 9 November 2015 at 12:27, Tathagata Das  wrote:

> How are you submitting the spark application?
> You are supposed to submit the fat-jar of the application that include the
> spark-streaming-twitter dependency (and its subdeps) but not
> spark-streaming and spark-core.
>
> On Mon, Nov 9, 2015 at 1:02 AM, أنس الليثي  wrote:
>
>> I tried to remove maven and adding the dependencies manually using build
>> path > configure build path > add external jars, then adding the jars
>> manually but it did not work.
>>
>> I tried to create another project and copied the code from the first app
>> but the problem still the same.
>>
>> I event tried to change eclipse with another version, but the same
>> problem exist.
>>
>> :( :( :( :(
>>
>> On 9 November 2015 at 10:47, أنس الليثي  wrote:
>>
>>> I tried both, the same exception still thrown
>>>
>>> On 9 November 2015 at 10:45, Sean Owen  wrote:
>>>
>>>> You included a very old version of the Twitter jar - 1.0.0. Did you
>>>> mean 1.5.1?
>>>>
>>>> On Mon, Nov 9, 2015 at 7:36 AM, fanooos  wrote:
>>>> > This is my first Spark Stream application. The setup is as following
>>>> >
>>>> > 3 nodes running a spark cluster. One master node and two slaves.
>>>> >
>>>> > The application is a simple java application streaming from twitter
>>>> and
>>>> > dependencies managed by maven.
>>>> >
>>>> > Here is the code of the application
>>>> >
>>>> > public class SimpleApp {
>>>> >
>>>> > public static void main(String[] args) {
>>>> >
>>>> > SparkConf conf = new SparkConf().setAppName("Simple
>>>> > Application").setMaster("spark://rethink-node01:7077");
>>>> >
>>>> > JavaStreamingContext sc = new JavaStreamingContext(conf, new
>>>> > Duration(1000));
>>>> >
>>>> > ConfigurationBuilder cb = new ConfigurationBuilder();
>>>> >
>>>> > cb.setDebugEnabled(true).setOAuthConsumerKey("ConsumerKey")
>>>> > .setOAuthConsumerSecret("ConsumerSecret")
>>>> > .setOAuthAccessToken("AccessToken")
>>>> > .setOAuthAccessTokenSecret("TokenSecret");
>>>> >
>>>> > OAuthAuthorization auth = new OAuthAuthorization(cb.build());
>>>> >
>>>> > JavaDStream tweets = TwitterUtils.createStream(sc,
>>>> auth);
>>>> >
>>>> >  JavaDStream statuses = tweets.map(new
>>>> Function>>> > String>() {
>>>> >  public String call(Status status) throws Exception {
>>>> > return status.getText();
>>>> > }
>>>> > });
>>>> >
>>>> >  statuses.print();;
>>>> >
>>>> >  sc.start();
>>>> >
>>>> >  sc.awaitTermination();
>>>> >
>>>> > }
>>>> >
>>>> > }
>>>> >
>>>> >
>>>> > here is the pom file
>>>> >
>>>> > http://maven.apache.org/POM/4.0.0";
>>>> > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
>>>> > xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
>>>> > http://maven.apache.org/xsd/maven-4.0.0.xsd";>
>>>> > 4.0.0
>>>> > SparkFirstTry
>>>> > SparkFirstTry
>>>> > 0.0.1-SNAPSHOT
>>>> >
>>>> > 
>>>> > 
>>>> > org.apache.spark
>>>> > spark-core_2.10
>>>> > 1.5.1
>>>> > provided
>>>> > 
>>>> >
>>>> > 
>>>> > org.apache.spark
>>>> > spark-streaming_2.10
>>>> > 1.5.1
>>>> > provided
>>>> > 
>>>> >
>>

Re: java.lang.ClassNotFoundException: org.apache.spark.streaming.twitter.TwitterReceiver

2015-11-09 Thread Tathagata Das
How are you submitting the Spark application?
You are supposed to submit the fat jar of the application that includes the
spark-streaming-twitter dependency (and its transitive dependencies) but not
spark-streaming and spark-core.
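A sketch of what that submission looks like with the artifacts from the quoted
pom (the jar path is the default maven-assembly-plugin output location):

$SPARK_HOME/bin/spark-submit \
  --class com.test.sparkTest.SimpleApp \
  --master spark://rethink-node01:7077 \
  target/SparkFirstTry-0.0.1-SNAPSHOT-jar-with-dependencies.jar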

On Mon, Nov 9, 2015 at 1:02 AM, أنس الليثي  wrote:

> I tried to remove maven and adding the dependencies manually using build
> path > configure build path > add external jars, then adding the jars
> manually but it did not work.
>
> I tried to create another project and copied the code from the first app
> but the problem still the same.
>
> I event tried to change eclipse with another version, but the same problem
> exist.
>
> :( :( :( :(
>
> On 9 November 2015 at 10:47, أنس الليثي  wrote:
>
>> I tried both, the same exception still thrown
>>
>> On 9 November 2015 at 10:45, Sean Owen  wrote:
>>
>>> You included a very old version of the Twitter jar - 1.0.0. Did you mean
>>> 1.5.1?
>>>
>>> On Mon, Nov 9, 2015 at 7:36 AM, fanooos  wrote:
>>> > This is my first Spark Stream application. The setup is as following
>>> >
>>> > 3 nodes running a spark cluster. One master node and two slaves.
>>> >
>>> > The application is a simple java application streaming from twitter and
>>> > dependencies managed by maven.
>>> >
>>> > Here is the code of the application
>>> >
>>> > public class SimpleApp {
>>> >
>>> > public static void main(String[] args) {
>>> >
>>> > SparkConf conf = new SparkConf().setAppName("Simple
>>> > Application").setMaster("spark://rethink-node01:7077");
>>> >
>>> > JavaStreamingContext sc = new JavaStreamingContext(conf, new
>>> > Duration(1000));
>>> >
>>> > ConfigurationBuilder cb = new ConfigurationBuilder();
>>> >
>>> > cb.setDebugEnabled(true).setOAuthConsumerKey("ConsumerKey")
>>> > .setOAuthConsumerSecret("ConsumerSecret")
>>> > .setOAuthAccessToken("AccessToken")
>>> > .setOAuthAccessTokenSecret("TokenSecret");
>>> >
>>> > OAuthAuthorization auth = new OAuthAuthorization(cb.build());
>>> >
>>> > JavaDStream tweets = TwitterUtils.createStream(sc,
>>> auth);
>>> >
>>> >  JavaDStream statuses = tweets.map(new Function>> > String>() {
>>> >  public String call(Status status) throws Exception {
>>> > return status.getText();
>>> > }
>>> > });
>>> >
>>> >  statuses.print();;
>>> >
>>> >  sc.start();
>>> >
>>> >  sc.awaitTermination();
>>> >
>>> > }
>>> >
>>> > }
>>> >
>>> >
>>> > here is the pom file
>>> >
>>> > http://maven.apache.org/POM/4.0.0";
>>> > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
>>> > xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
>>> > http://maven.apache.org/xsd/maven-4.0.0.xsd";>
>>> > 4.0.0
>>> > SparkFirstTry
>>> > SparkFirstTry
>>> > 0.0.1-SNAPSHOT
>>> >
>>> > 
>>> > 
>>> > org.apache.spark
>>> > spark-core_2.10
>>> > 1.5.1
>>> > provided
>>> > 
>>> >
>>> > 
>>> > org.apache.spark
>>> > spark-streaming_2.10
>>> > 1.5.1
>>> > provided
>>> > 
>>> >
>>> > 
>>> > org.twitter4j
>>> > twitter4j-stream
>>> > 3.0.3
>>> > 
>>> > 
>>> > org.apache.spark
>>> > spark-streaming-twitter_2.10
>>> > 1.0.0
>>> > 
>>> >
>>> > 
>>> >
>>> > 
>>> > src
>>> > 
>>> > 
>>> > maven-compiler-plugin
>>> > 3.3
>>> > 

Re: java.lang.ClassNotFoundException: org.apache.spark.streaming.twitter.TwitterReceiver

2015-11-09 Thread أنس الليثي
I tried removing Maven and adding the dependencies manually (Build Path >
Configure Build Path > Add External JARs), but it did not work.

I tried to create another project and copied the code from the first app,
but the problem is still the same.

I even tried changing Eclipse to another version, but the same problem
still exists.

:( :( :( :(

On 9 November 2015 at 10:47, أنس الليثي  wrote:

> I tried both, the same exception still thrown
>
> On 9 November 2015 at 10:45, Sean Owen  wrote:
>
>> You included a very old version of the Twitter jar - 1.0.0. Did you mean
>> 1.5.1?
>>
>> On Mon, Nov 9, 2015 at 7:36 AM, fanooos  wrote:
>> > This is my first Spark Stream application. The setup is as following
>> >
>> > 3 nodes running a spark cluster. One master node and two slaves.
>> >
>> > The application is a simple java application streaming from twitter and
>> > dependencies managed by maven.
>> >
>> > Here is the code of the application
>> >
>> > public class SimpleApp {
>> >
>> > public static void main(String[] args) {
>> >
>> > SparkConf conf = new SparkConf().setAppName("Simple
>> > Application").setMaster("spark://rethink-node01:7077");
>> >
>> > JavaStreamingContext sc = new JavaStreamingContext(conf, new
>> > Duration(1000));
>> >
>> > ConfigurationBuilder cb = new ConfigurationBuilder();
>> >
>> > cb.setDebugEnabled(true).setOAuthConsumerKey("ConsumerKey")
>> > .setOAuthConsumerSecret("ConsumerSecret")
>> > .setOAuthAccessToken("AccessToken")
>> > .setOAuthAccessTokenSecret("TokenSecret");
>> >
>> > OAuthAuthorization auth = new OAuthAuthorization(cb.build());
>> >
>> > JavaDStream tweets = TwitterUtils.createStream(sc,
>> auth);
>> >
>> >  JavaDStream statuses = tweets.map(new Function> > String>() {
>> >  public String call(Status status) throws Exception {
>> > return status.getText();
>> > }
>> > });
>> >
>> >  statuses.print();;
>> >
>> >  sc.start();
>> >
>> >  sc.awaitTermination();
>> >
>> > }
>> >
>> > }
>> >
>> >
>> > here is the pom file
>> >
>> > http://maven.apache.org/POM/4.0.0";
>> > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
>> > xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
>> > http://maven.apache.org/xsd/maven-4.0.0.xsd";>
>> > 4.0.0
>> > SparkFirstTry
>> > SparkFirstTry
>> > 0.0.1-SNAPSHOT
>> >
>> > 
>> > 
>> > org.apache.spark
>> > spark-core_2.10
>> > 1.5.1
>> > provided
>> > 
>> >
>> > 
>> > org.apache.spark
>> > spark-streaming_2.10
>> > 1.5.1
>> > provided
>> > 
>> >
>> > 
>> > org.twitter4j
>> > twitter4j-stream
>> > 3.0.3
>> > 
>> > 
>> > org.apache.spark
>> > spark-streaming-twitter_2.10
>> > 1.0.0
>> > 
>> >
>> > 
>> >
>> > 
>> > src
>> > 
>> > 
>> > maven-compiler-plugin
>> > 3.3
>> > 
>> > 1.8
>> > 1.8
>> > 
>> > 
>> > 
>> > maven-assembly-plugin
>> > 
>> > 
>> > 
>> >
>> > com.test.sparkTest.SimpleApp
>> > 
>> > 
>> > 
>> >
>>  jar-with-dependencies
>> > 
>> > 
>> > 
>> >
>> > 
>> > 
>> > 
>> >
>> >
>> > The application starts successfully 

Re: java.lang.ClassNotFoundException: org.apache.spark.streaming.twitter.TwitterReceiver

2015-11-08 Thread أنس الليثي
I tried both; the same exception is still thrown.

On 9 November 2015 at 10:45, Sean Owen  wrote:

> You included a very old version of the Twitter jar - 1.0.0. Did you mean
> 1.5.1?
>
> On Mon, Nov 9, 2015 at 7:36 AM, fanooos  wrote:
> > This is my first Spark Stream application. The setup is as following
> >
> > 3 nodes running a spark cluster. One master node and two slaves.
> >
> > The application is a simple java application streaming from twitter and
> > dependencies managed by maven.
> >
> > Here is the code of the application
> >
> > public class SimpleApp {
> >
> > public static void main(String[] args) {
> >
> > SparkConf conf = new SparkConf().setAppName("Simple
> > Application").setMaster("spark://rethink-node01:7077");
> >
> > JavaStreamingContext sc = new JavaStreamingContext(conf, new
> > Duration(1000));
> >
> > ConfigurationBuilder cb = new ConfigurationBuilder();
> >
> > cb.setDebugEnabled(true).setOAuthConsumerKey("ConsumerKey")
> > .setOAuthConsumerSecret("ConsumerSecret")
> > .setOAuthAccessToken("AccessToken")
> > .setOAuthAccessTokenSecret("TokenSecret");
> >
> > OAuthAuthorization auth = new OAuthAuthorization(cb.build());
> >
> > JavaDStream tweets = TwitterUtils.createStream(sc, auth);
> >
> >  JavaDStream statuses = tweets.map(new Function > String>() {
> >  public String call(Status status) throws Exception {
> > return status.getText();
> > }
> > });
> >
> >  statuses.print();;
> >
> >  sc.start();
> >
> >  sc.awaitTermination();
> >
> > }
> >
> > }
> >
> >
> > here is the pom file
> >
> > http://maven.apache.org/POM/4.0.0";
> > xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
> > xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
> > http://maven.apache.org/xsd/maven-4.0.0.xsd";>
> > 4.0.0
> > SparkFirstTry
> > SparkFirstTry
> > 0.0.1-SNAPSHOT
> >
> > 
> > 
> > org.apache.spark
> > spark-core_2.10
> > 1.5.1
> > provided
> > 
> >
> > 
> > org.apache.spark
> > spark-streaming_2.10
> > 1.5.1
> > provided
> > 
> >
> > 
> > org.twitter4j
> > twitter4j-stream
> > 3.0.3
> > 
> > 
> > org.apache.spark
> > spark-streaming-twitter_2.10
> > 1.0.0
> > 
> >
> > 
> >
> > 
> > src
> > 
> > 
> > maven-compiler-plugin
> > 3.3
> > 
> > 1.8
> > 1.8
> > 
> > 
> > 
> > maven-assembly-plugin
> > 
> > 
> > 
> >
> > com.test.sparkTest.SimpleApp
> > 
> > 
> > 
> >
>  jar-with-dependencies
> > 
> > 
> > 
> >
> > 
> > 
> > 
> >
> >
> > The application starts successfully but no tweets comes and this
> exception
> > is thrown
> >
> > 15/11/08 15:55:46 WARN TaskSetManager: Lost task 0.0 in stage 4.0 (TID
> 78,
> > 192.168.122.39): java.io.IOException: java.lang.ClassNotFoundException:
> > org.apache.spark.streaming.twitter.TwitterReceiver
> > at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1163)
> > at
> >
> org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:70)
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at
> >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> > at
> >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:497)
> > at
> > java.io.ObjectStreamClass.invo

Re: java.lang.ClassNotFoundException: org.apache.spark.streaming.twitter.TwitterReceiver

2015-11-08 Thread Sean Owen
You included a very old version of the spark-streaming-twitter jar - 1.0.0. Did you mean 1.5.1?
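In the pom that means bumping this dependency (same coordinates as already in
the file, just the newer version):

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-twitter_2.10</artifactId>
    <version>1.5.1</version>
</dependency>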

On Mon, Nov 9, 2015 at 7:36 AM, fanooos  wrote:
> This is my first Spark Stream application. The setup is as following
>
> 3 nodes running a spark cluster. One master node and two slaves.
>
> The application is a simple java application streaming from twitter and
> dependencies managed by maven.
>
> Here is the code of the application
>
> public class SimpleApp {
>
> public static void main(String[] args) {
>
> SparkConf conf = new SparkConf().setAppName("Simple
> Application").setMaster("spark://rethink-node01:7077");
>
> JavaStreamingContext sc = new JavaStreamingContext(conf, new
> Duration(1000));
>
> ConfigurationBuilder cb = new ConfigurationBuilder();
>
> cb.setDebugEnabled(true).setOAuthConsumerKey("ConsumerKey")
> .setOAuthConsumerSecret("ConsumerSecret")
> .setOAuthAccessToken("AccessToken")
> .setOAuthAccessTokenSecret("TokenSecret");
>
> OAuthAuthorization auth = new OAuthAuthorization(cb.build());
>
> JavaDStream tweets = TwitterUtils.createStream(sc, auth);
>
>  JavaDStream statuses = tweets.map(new Function String>() {
>  public String call(Status status) throws Exception {
> return status.getText();
> }
> });
>
>  statuses.print();;
>
>  sc.start();
>
>  sc.awaitTermination();
>
> }
>
> }
>
>
> here is the pom file
>
> http://maven.apache.org/POM/4.0.0";
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
> xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
> http://maven.apache.org/xsd/maven-4.0.0.xsd";>
> 4.0.0
> SparkFirstTry
> SparkFirstTry
> 0.0.1-SNAPSHOT
>
> 
> 
> org.apache.spark
> spark-core_2.10
> 1.5.1
> provided
> 
>
> 
> org.apache.spark
> spark-streaming_2.10
> 1.5.1
> provided
> 
>
> 
> org.twitter4j
> twitter4j-stream
> 3.0.3
> 
> 
> org.apache.spark
> spark-streaming-twitter_2.10
> 1.0.0
> 
>
> 
>
> 
> src
> 
> 
> maven-compiler-plugin
> 3.3
> 
> 1.8
> 1.8
> 
> 
> 
> maven-assembly-plugin
> 
> 
> 
>
> com.test.sparkTest.SimpleApp
>     
> 
> 
> jar-with-dependencies
> 
> 
> 
>
> 
> 
> 
>
>
> The application starts successfully but no tweets comes and this exception
> is thrown
>
> 15/11/08 15:55:46 WARN TaskSetManager: Lost task 0.0 in stage 4.0 (TID 78,
> 192.168.122.39): java.io.IOException: java.lang.ClassNotFoundException:
> org.apache.spark.streaming.twitter.TwitterReceiver
> at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1163)
> at
> org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:70)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at
> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900)
> at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
> at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
> at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
> at
> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSeri

java.lang.ClassNotFoundException: org.apache.spark.streaming.twitter.TwitterReceiver

2015-11-08 Thread fanooos
This is my first Spark Streaming application. The setup is as follows:

3 nodes running a Spark cluster: one master node and two slaves.

The application is a simple Java application streaming from Twitter, with
dependencies managed by Maven.

Here is the code of the application

public class SimpleApp {

    public static void main(String[] args) {

        SparkConf conf = new SparkConf().setAppName("Simple Application")
                .setMaster("spark://rethink-node01:7077");

        JavaStreamingContext sc = new JavaStreamingContext(conf, new Duration(1000));

        ConfigurationBuilder cb = new ConfigurationBuilder();

        cb.setDebugEnabled(true).setOAuthConsumerKey("ConsumerKey")
                .setOAuthConsumerSecret("ConsumerSecret")
                .setOAuthAccessToken("AccessToken")
                .setOAuthAccessTokenSecret("TokenSecret");

        OAuthAuthorization auth = new OAuthAuthorization(cb.build());

        JavaDStream<Status> tweets = TwitterUtils.createStream(sc, auth);

        JavaDStream<String> statuses = tweets.map(new Function<Status, String>() {
            public String call(Status status) throws Exception {
                return status.getText();
            }
        });

        statuses.print();

        sc.start();

        sc.awaitTermination();

    }

}


Here is the pom file:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>SparkFirstTry</groupId>
    <artifactId>SparkFirstTry</artifactId>
    <version>0.0.1-SNAPSHOT</version>

    <dependencies>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.5.1</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.10</artifactId>
            <version>1.5.1</version>
            <scope>provided</scope>
        </dependency>

        <dependency>
            <groupId>org.twitter4j</groupId>
            <artifactId>twitter4j-stream</artifactId>
            <version>3.0.3</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming-twitter_2.10</artifactId>
            <version>1.0.0</version>
        </dependency>
    </dependencies>

    <build>
        <sourceDirectory>src</sourceDirectory>
        <plugins>
            <plugin>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.3</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                </configuration>
            </plugin>
            <plugin>
                <artifactId>maven-assembly-plugin</artifactId>
                <configuration>
                    <archive>
                        <manifest>
                            <mainClass>com.test.sparkTest.SimpleApp</mainClass>
                        </manifest>
                    </archive>
                    <descriptorRefs>
                        <descriptorRef>jar-with-dependencies</descriptorRef>
                    </descriptorRefs>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>


The application starts successfully, but no tweets come, and this exception
is thrown:

15/11/08 15:55:46 WARN TaskSetManager: Lost task 0.0 in stage 4.0 (TID 78,
192.168.122.39): java.io.IOException: java.lang.ClassNotFoundException:
org.apache.spark.streaming.twitter.TwitterReceiver
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1163)
at
org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:70)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at
java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900)
at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
at
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
at
org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:194)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException:
org.apache.spark.streaming.twitter.TwitterReceiver
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at
org.apache.spark.s

Re: java.lang.ClassNotFoundException

2015-08-08 Thread Yasemin Kaya
Thanks Ted, I solved it :)

2015-08-08 14:07 GMT+03:00 Ted Yu :

> Have you tried including package name in the class name ?
>
> Thanks
>
>
>
> On Aug 8, 2015, at 12:00 AM, Yasemin Kaya  wrote:
>
> Hi,
>
> I have a little spark program and i am getting an error why i dont
> understand.
> My code is https://gist.github.com/yaseminn/522a75b863ad78934bc3.
> I am using spark 1.3
> Submitting : bin/spark-submit --class MonthlyAverage --master local[4]
> weather.jar
>
>
> error:
>
> ~/spark-1.3.1-bin-hadoop2.4$ bin/spark-submit --class MonthlyAverage
> --master local[4] weather.jar
> java.lang.ClassNotFoundException: MonthlyAverage
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:274)
> at
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:538)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Using Spark's default log4j profile:
> org/apache/spark/log4j-defaults.properties
>
>
> Please help me Asap..
>
> yasemin
> --
> hiç ender hiç
>
>


-- 
hiç ender hiç


Re: java.lang.ClassNotFoundException

2015-08-08 Thread Ted Yu
Have you tried including the package name in the class name?
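For example, if MonthlyAverage lives in a package (say, a hypothetical
com.example.weather), the submit line becomes:

bin/spark-submit --class com.example.weather.MonthlyAverage --master local[4] weather.jar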

Thanks



> On Aug 8, 2015, at 12:00 AM, Yasemin Kaya  wrote:
> 
> Hi,
> 
> I have a little spark program and i am getting an error why i dont 
> understand. 
> My code is https://gist.github.com/yaseminn/522a75b863ad78934bc3.
> I am using spark 1.3 
> Submitting : bin/spark-submit --class MonthlyAverage --master local[4] 
> weather.jar
> 
> 
> error: 
> 
> ~/spark-1.3.1-bin-hadoop2.4$ bin/spark-submit --class MonthlyAverage --master 
> local[4] weather.jar
> java.lang.ClassNotFoundException: MonthlyAverage
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>   at java.lang.Class.forName0(Native Method)
>   at java.lang.Class.forName(Class.java:274)
>   at 
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:538)
>   at 
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
>   at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
>   at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
>   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Using Spark's default log4j profile: 
> org/apache/spark/log4j-defaults.properties
> 
> 
> Please help me Asap..
> 
> yasemin
> -- 
> hiç ender hiç


java.lang.ClassNotFoundException

2015-08-08 Thread Yasemin Kaya
Hi,

I have a little Spark program and I am getting an error that I don't
understand.
My code is https://gist.github.com/yaseminn/522a75b863ad78934bc3.
I am using Spark 1.3.
Submitting : bin/spark-submit --class MonthlyAverage --master local[4]
weather.jar


error:

~/spark-1.3.1-bin-hadoop2.4$ bin/spark-submit --class MonthlyAverage
--master local[4] weather.jar
java.lang.ClassNotFoundException: MonthlyAverage
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:538)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Using Spark's default log4j profile:
org/apache/spark/log4j-defaults.properties


Please help me ASAP.

yasemin
-- 
hiç ender hiç


Re: Strange JavaDeserialization error - java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel

2015-03-27 Thread Tathagata Das
Seems like a bug, could you file a JIRA?

@Tim: Patrick said you take a look at Mesos-related issues. Could you take
a look at this? Thanks!

TD

On Fri, Mar 27, 2015 at 1:25 PM, Ondrej Smola 
wrote:

> Yes, only when using fine grained mode and replication 
> (StorageLevel.MEMORY_ONLY_2
> etc).
>
> 2015-03-27 19:06 GMT+01:00 Tathagata Das :
>
>> Does it fail with just Spark jobs (using storage levels) on non-coarse
>> mode?
>>
>> TD
>>
>> On Fri, Mar 27, 2015 at 4:39 AM, Ondrej Smola 
>> wrote:
>>
>>> More info
>>>
>>> when using *spark.mesos.coarse* everything works as expected. I think
>>> this must be a bug in spark-mesos integration.
>>>
>>>
>>> 2015-03-27 9:23 GMT+01:00 Ondrej Smola :
>>>
>>>> It happens only when StorageLevel is used with 1 replica (
>>>> StorageLevel.MEMORY_ONLY_2,StorageLevel.MEMORY_AND_DISK_2) ,
>>>> StorageLevel.MEMORY_ONLY ,StorageLevel.MEMORY_AND_DISK works - the
>>>> problems must be clearly somewhere between mesos-spark . From console I see
>>>> that spark is trying to replicate to nodes -> nodes show up in Mesos active
>>>> tasks ... but they always fail with ClassNotFoundE.
>>>>
>>>> 2015-03-27 0:52 GMT+01:00 Tathagata Das :
>>>>
>>>>> Could you try running a simpler spark streaming program with receiver
>>>>> (may be socketStream) and see if that works.
>>>>>
>>>>> TD
>>>>>
>>>>> On Thu, Mar 26, 2015 at 2:08 PM, Ondrej Smola 
>>>>> wrote:
>>>>>
>>>>>> Hi thanks for reply,
>>>>>>
>>>>>> yes I have custom receiver -> but it has simple logic .. pop ids from
>>>>>> redis queue -> load docs based on ids from elastic and store them in 
>>>>>> spark.
>>>>>> No classloader modifications. I am running multiple Spark batch jobs 
>>>>>> (with
>>>>>> user supplied partitioning) and they have no problems, debug in local 
>>>>>> mode
>>>>>> show no errors.
>>>>>>
>>>>>> 2015-03-26 21:47 GMT+01:00 Tathagata Das :
>>>>>>
>>>>>>> Here are few steps to debug.
>>>>>>>
>>>>>>> 1. Try using replication from a Spark job: sc.parallelize(1 to 100,
>>>>>>> 100).persist(StorageLevel.MEMORY_ONLY_2).count()
>>>>>>> 2. If one works, then we know that there is probably nothing wrong
>>>>>>> with the Spark installation, and probably in the threads related to the
>>>>>>> receivers receiving the data. Are you writing a custom receiver? Are you
>>>>>>> somehow playing around with the class loader in the custom receiver?
>>>>>>>
>>>>>>> TD
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Mar 26, 2015 at 10:59 AM, Ondrej Smola <
>>>>>>> ondrej.sm...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I am running spark streaming v 1.3.0 (running inside Docker) on
>>>>>>>> Mesos 0.21.1. Spark streaming is started using Marathon -> docker 
>>>>>>>> container
>>>>>>>> gets deployed and starts streaming (from custom Actor). Spark binary is
>>>>>>>> located on shared GlusterFS volume. Data is streamed from
>>>>>>>> Elasticsearch/Redis. When new batch arrives Spark tries to replicate 
>>>>>>>> it but
>>>>>>>> fails with following error :
>>>>>>>>
>>>>>>>> 15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0 of size 2840
>>>>>>>> dropped from memory (free 278017782)
>>>>>>>> 15/03/26 14:50:00 INFO BlockManager: Removing block
>>>>>>>> broadcast_0_piece0
>>>>>>>> 15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0_piece0 of
>>>>>>>> size 1658 dropped from memory (free 278019440)
>>>>>>>> 15/03/26 14:50:00 INFO BlockManagerMaster: Updated info of block
>>>>>>>> broadcast_0_piece0
>>>>>>>> 15/03/26 14:50:00 ERROR TransportRequestHandler: Error while
>>>>>>>> invoking RpcHandler#receive() on RPC id 717876

Re: Strange JavaDeserialization error - java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel

2015-03-27 Thread Ondrej Smola
Yes, only when using fine-grained mode and replication
(StorageLevel.MEMORY_ONLY_2 etc.).
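(For reference, the coarse-grained workaround mentioned below is just a
configuration switch, e.g.

bin/spark-submit --conf spark.mesos.coarse=true ...

or the equivalent spark.mesos.coarse=true entry in spark-defaults.conf.)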

2015-03-27 19:06 GMT+01:00 Tathagata Das :

> Does it fail with just Spark jobs (using storage levels) on non-coarse
> mode?
>
> TD
>
> On Fri, Mar 27, 2015 at 4:39 AM, Ondrej Smola 
> wrote:
>
>> More info
>>
>> when using *spark.mesos.coarse* everything works as expected. I think
>> this must be a bug in spark-mesos integration.
>>
>>
>> 2015-03-27 9:23 GMT+01:00 Ondrej Smola :
>>
>>> It happens only when StorageLevel is used with 1 replica ( StorageLevel.
>>> MEMORY_ONLY_2,StorageLevel.MEMORY_AND_DISK_2) , StorageLevel.MEMORY_ONLY
>>> ,StorageLevel.MEMORY_AND_DISK works - the problems must be clearly
>>> somewhere between mesos-spark . From console I see that spark is trying to
>>> replicate to nodes -> nodes show up in Mesos active tasks ... but they
>>> always fail with ClassNotFoundE.
>>>
>>> 2015-03-27 0:52 GMT+01:00 Tathagata Das :
>>>
>>>> Could you try running a simpler spark streaming program with receiver
>>>> (may be socketStream) and see if that works.
>>>>
>>>> TD
>>>>
>>>> On Thu, Mar 26, 2015 at 2:08 PM, Ondrej Smola 
>>>> wrote:
>>>>
>>>>> Hi thanks for reply,
>>>>>
>>>>> yes I have custom receiver -> but it has simple logic .. pop ids from
>>>>> redis queue -> load docs based on ids from elastic and store them in 
>>>>> spark.
>>>>> No classloader modifications. I am running multiple Spark batch jobs (with
>>>>> user supplied partitioning) and they have no problems, debug in local mode
>>>>> show no errors.
>>>>>
>>>>> 2015-03-26 21:47 GMT+01:00 Tathagata Das :
>>>>>
>>>>>> Here are few steps to debug.
>>>>>>
>>>>>> 1. Try using replication from a Spark job: sc.parallelize(1 to 100,
>>>>>> 100).persist(StorageLevel.MEMORY_ONLY_2).count()
>>>>>> 2. If one works, then we know that there is probably nothing wrong
>>>>>> with the Spark installation, and probably in the threads related to the
>>>>>> receivers receiving the data. Are you writing a custom receiver? Are you
>>>>>> somehow playing around with the class loader in the custom receiver?
>>>>>>
>>>>>> TD
>>>>>>
>>>>>>
>>>>>> On Thu, Mar 26, 2015 at 10:59 AM, Ondrej Smola <
>>>>>> ondrej.sm...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am running spark streaming v 1.3.0 (running inside Docker) on
>>>>>>> Mesos 0.21.1. Spark streaming is started using Marathon -> docker 
>>>>>>> container
>>>>>>> gets deployed and starts streaming (from custom Actor). Spark binary is
>>>>>>> located on shared GlusterFS volume. Data is streamed from
>>>>>>> Elasticsearch/Redis. When new batch arrives Spark tries to replicate it 
>>>>>>> but
>>>>>>> fails with following error :
>>>>>>>
>>>>>>> 15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0 of size 2840
>>>>>>> dropped from memory (free 278017782)
>>>>>>> 15/03/26 14:50:00 INFO BlockManager: Removing block
>>>>>>> broadcast_0_piece0
>>>>>>> 15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0_piece0 of size
>>>>>>> 1658 dropped from memory (free 278019440)
>>>>>>> 15/03/26 14:50:00 INFO BlockManagerMaster: Updated info of block
>>>>>>> broadcast_0_piece0
>>>>>>> 15/03/26 14:50:00 ERROR TransportRequestHandler: Error while
>>>>>>> invoking RpcHandler#receive() on RPC id 7178767328921933569
>>>>>>> java.lang.ClassNotFoundException:
>>>>>>> org/apache/spark/storage/StorageLevel
>>>>>>> at java.lang.Class.forName0(Native Method)
>>>>>>> at java.lang.Class.forName(Class.java:344)
>>>>>>> at
>>>>>>> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:65)
>>>>>>> at
>>>>>>> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
>>>>>>> 

Re: Strange JavaDeserialization error - java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel

2015-03-27 Thread Tathagata Das
Does it fail with just Spark jobs (using storage levels) on non-coarse mode?

TD
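
A minimal sketch of that check, run from spark-shell against the Mesos cluster (it is the same test suggested further down in this thread; `sc` is the shell's SparkContext):

import org.apache.spark.storage.StorageLevel

// Plain (non-streaming) job that persists with a replicated storage level;
// if this also fails in fine-grained Mesos mode, the problem is in block
// replication itself rather than in the streaming receiver.
val n = sc.parallelize(1 to 100, 100)
  .persist(StorageLevel.MEMORY_ONLY_2)
  .count()
println(s"count = $n")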

On Fri, Mar 27, 2015 at 4:39 AM, Ondrej Smola 
wrote:

> More info
>
> when using *spark.mesos.coarse* everything works as expected. I think
> this must be a bug in spark-mesos integration.
>
>
> 2015-03-27 9:23 GMT+01:00 Ondrej Smola :
>
>> It happens only when StorageLevel is used with 1 replica ( StorageLevel.
>> MEMORY_ONLY_2,StorageLevel.MEMORY_AND_DISK_2) , StorageLevel.MEMORY_ONLY
>> ,StorageLevel.MEMORY_AND_DISK works - the problems must be clearly
>> somewhere between mesos-spark . From console I see that spark is trying to
>> replicate to nodes -> nodes show up in Mesos active tasks ... but they
>> always fail with ClassNotFoundE.
>>
>> 2015-03-27 0:52 GMT+01:00 Tathagata Das :
>>
>>> Could you try running a simpler spark streaming program with receiver
>>> (may be socketStream) and see if that works.
>>>
>>> TD
>>>
>>> On Thu, Mar 26, 2015 at 2:08 PM, Ondrej Smola 
>>> wrote:
>>>
>>>> Hi thanks for reply,
>>>>
>>>> yes I have custom receiver -> but it has simple logic .. pop ids from
>>>> redis queue -> load docs based on ids from elastic and store them in spark.
>>>> No classloader modifications. I am running multiple Spark batch jobs (with
>>>> user supplied partitioning) and they have no problems, debug in local mode
>>>> show no errors.
>>>>
>>>> 2015-03-26 21:47 GMT+01:00 Tathagata Das :
>>>>
>>>>> Here are few steps to debug.
>>>>>
>>>>> 1. Try using replication from a Spark job: sc.parallelize(1 to 100,
>>>>> 100).persist(StorageLevel.MEMORY_ONLY_2).count()
>>>>> 2. If one works, then we know that there is probably nothing wrong
>>>>> with the Spark installation, and probably in the threads related to the
>>>>> receivers receiving the data. Are you writing a custom receiver? Are you
>>>>> somehow playing around with the class loader in the custom receiver?
>>>>>
>>>>> TD
>>>>>
>>>>>
>>>>> On Thu, Mar 26, 2015 at 10:59 AM, Ondrej Smola >>>> > wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am running spark streaming v 1.3.0 (running inside Docker) on Mesos
>>>>>> 0.21.1. Spark streaming is started using Marathon -> docker container 
>>>>>> gets
>>>>>> deployed and starts streaming (from custom Actor). Spark binary is 
>>>>>> located
>>>>>> on shared GlusterFS volume. Data is streamed from Elasticsearch/Redis. 
>>>>>> When
>>>>>> new batch arrives Spark tries to replicate it but fails with following
>>>>>> error :
>>>>>>
>>>>>> 15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0 of size 2840
>>>>>> dropped from memory (free 278017782)
>>>>>> 15/03/26 14:50:00 INFO BlockManager: Removing block broadcast_0_piece0
>>>>>> 15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0_piece0 of size
>>>>>> 1658 dropped from memory (free 278019440)
>>>>>> 15/03/26 14:50:00 INFO BlockManagerMaster: Updated info of block
>>>>>> broadcast_0_piece0
>>>>>> 15/03/26 14:50:00 ERROR TransportRequestHandler: Error while invoking
>>>>>> RpcHandler#receive() on RPC id 7178767328921933569
>>>>>> java.lang.ClassNotFoundException:
>>>>>> org/apache/spark/storage/StorageLevel
>>>>>> at java.lang.Class.forName0(Native Method)
>>>>>> at java.lang.Class.forName(Class.java:344)
>>>>>> at
>>>>>> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:65)
>>>>>> at
>>>>>> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
>>>>>> at
>>>>>> java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
>>>>>> at
>>>>>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
>>>>>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>>>>>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>>>>>> at
>>>>>> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializ

Re: Strange JavaDeserialization error - java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel

2015-03-27 Thread Ondrej Smola
More info

when using *spark.mesos.coarse* everything works as expected. I think this
must be a bug in the Spark-Mesos integration.
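
For reference, a minimal sketch of forcing coarse-grained mode from application code; the Mesos master URL and application name are placeholders, since the thread only states that coarse mode works, not how it was enabled:

import org.apache.spark.{SparkConf, SparkContext}

// Coarse-grained Mesos mode, reported above as the working configuration;
// fine-grained mode is where the replication ClassNotFoundException shows up.
val conf = new SparkConf()
  .setMaster("mesos://mesos-master:5050")   // placeholder master URL
  .setAppName("streaming-app")              // placeholder app name
  .set("spark.mesos.coarse", "true")

val sc = new SparkContext(conf)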


2015-03-27 9:23 GMT+01:00 Ondrej Smola :

> It happens only when StorageLevel is used with 1 replica ( StorageLevel.
> MEMORY_ONLY_2,StorageLevel.MEMORY_AND_DISK_2) , StorageLevel.MEMORY_ONLY ,
> StorageLevel.MEMORY_AND_DISK works - the problems must be clearly
> somewhere between mesos-spark . From console I see that spark is trying to
> replicate to nodes -> nodes show up in Mesos active tasks ... but they
> always fail with ClassNotFoundE.
>
> 2015-03-27 0:52 GMT+01:00 Tathagata Das :
>
>> Could you try running a simpler spark streaming program with receiver
>> (may be socketStream) and see if that works.
>>
>> TD
>>
>> On Thu, Mar 26, 2015 at 2:08 PM, Ondrej Smola 
>> wrote:
>>
>>> Hi thanks for reply,
>>>
>>> yes I have custom receiver -> but it has simple logic .. pop ids from
>>> redis queue -> load docs based on ids from elastic and store them in spark.
>>> No classloader modifications. I am running multiple Spark batch jobs (with
>>> user supplied partitioning) and they have no problems, debug in local mode
>>> show no errors.
>>>
>>> 2015-03-26 21:47 GMT+01:00 Tathagata Das :
>>>
>>>> Here are few steps to debug.
>>>>
>>>> 1. Try using replication from a Spark job: sc.parallelize(1 to 100,
>>>> 100).persist(StorageLevel.MEMORY_ONLY_2).count()
>>>> 2. If one works, then we know that there is probably nothing wrong with
>>>> the Spark installation, and probably in the threads related to the
>>>> receivers receiving the data. Are you writing a custom receiver? Are you
>>>> somehow playing around with the class loader in the custom receiver?
>>>>
>>>> TD
>>>>
>>>>
>>>> On Thu, Mar 26, 2015 at 10:59 AM, Ondrej Smola 
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am running spark streaming v 1.3.0 (running inside Docker) on Mesos
>>>>> 0.21.1. Spark streaming is started using Marathon -> docker container gets
>>>>> deployed and starts streaming (from custom Actor). Spark binary is located
>>>>> on shared GlusterFS volume. Data is streamed from Elasticsearch/Redis. 
>>>>> When
>>>>> new batch arrives Spark tries to replicate it but fails with following
>>>>> error :
>>>>>
>>>>> 15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0 of size 2840
>>>>> dropped from memory (free 278017782)
>>>>> 15/03/26 14:50:00 INFO BlockManager: Removing block broadcast_0_piece0
>>>>> 15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0_piece0 of size
>>>>> 1658 dropped from memory (free 278019440)
>>>>> 15/03/26 14:50:00 INFO BlockManagerMaster: Updated info of block
>>>>> broadcast_0_piece0
>>>>> 15/03/26 14:50:00 ERROR TransportRequestHandler: Error while invoking
>>>>> RpcHandler#receive() on RPC id 7178767328921933569
>>>>> java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel
>>>>> at java.lang.Class.forName0(Native Method)
>>>>> at java.lang.Class.forName(Class.java:344)
>>>>> at
>>>>> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:65)
>>>>> at
>>>>> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
>>>>> at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
>>>>> at
>>>>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
>>>>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>>>>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>>>>> at
>>>>> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:68)
>>>>> at
>>>>> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:88)
>>>>> at
>>>>> org.apache.spark.network.netty.NettyBlockRpcServer.receive(NettyBlockRpcServer.scala:65)
>>>>> at
>>>>> org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:124)
>>>>> at
>>>>> org.apache.spark.network.server.TransportRequestHandler.hand

Re: Strange JavaDeserialization error - java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel

2015-03-27 Thread Ondrej Smola
It happens only when a replicated StorageLevel is used (StorageLevel.MEMORY_ONLY_2,
StorageLevel.MEMORY_AND_DISK_2); StorageLevel.MEMORY_ONLY and
StorageLevel.MEMORY_AND_DISK work fine, so the problem must be somewhere in the
Spark-Mesos integration. From the console I can see that Spark is trying to
replicate to other nodes -> the nodes show up as active tasks in Mesos, but they
always fail with ClassNotFoundException.
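
Following the suggestion quoted below to try a simpler receiver-based program, a minimal sketch of such a test that still uses a replicated storage level; the host, port, and batch interval are placeholders:

import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Minimal socket-based receiver with a replicated storage level; if this also
// fails on fine-grained Mesos, the custom Actor receiver is not the cause.
val conf  = new SparkConf().setAppName("socket-replication-test")
val ssc   = new StreamingContext(conf, Seconds(5))
val lines = ssc.socketTextStream("localhost", 9999, StorageLevel.MEMORY_ONLY_2)

lines.count().print()

ssc.start()
ssc.awaitTermination()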

2015-03-27 0:52 GMT+01:00 Tathagata Das :

> Could you try running a simpler spark streaming program with receiver (may
> be socketStream) and see if that works.
>
> TD
>
> On Thu, Mar 26, 2015 at 2:08 PM, Ondrej Smola 
> wrote:
>
>> Hi thanks for reply,
>>
>> yes I have custom receiver -> but it has simple logic .. pop ids from
>> redis queue -> load docs based on ids from elastic and store them in spark.
>> No classloader modifications. I am running multiple Spark batch jobs (with
>> user supplied partitioning) and they have no problems, debug in local mode
>> show no errors.
>>
>> 2015-03-26 21:47 GMT+01:00 Tathagata Das :
>>
>>> Here are few steps to debug.
>>>
>>> 1. Try using replication from a Spark job: sc.parallelize(1 to 100,
>>> 100).persist(StorageLevel.MEMORY_ONLY_2).count()
>>> 2. If one works, then we know that there is probably nothing wrong with
>>> the Spark installation, and probably in the threads related to the
>>> receivers receiving the data. Are you writing a custom receiver? Are you
>>> somehow playing around with the class loader in the custom receiver?
>>>
>>> TD
>>>
>>>
>>> On Thu, Mar 26, 2015 at 10:59 AM, Ondrej Smola 
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am running spark streaming v 1.3.0 (running inside Docker) on Mesos
>>>> 0.21.1. Spark streaming is started using Marathon -> docker container gets
>>>> deployed and starts streaming (from custom Actor). Spark binary is located
>>>> on shared GlusterFS volume. Data is streamed from Elasticsearch/Redis. When
>>>> new batch arrives Spark tries to replicate it but fails with following
>>>> error :
>>>>
>>>> 15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0 of size 2840
>>>> dropped from memory (free 278017782)
>>>> 15/03/26 14:50:00 INFO BlockManager: Removing block broadcast_0_piece0
>>>> 15/03/26 14:50:00 INFO MemoryStore: Block broadcast_0_piece0 of size
>>>> 1658 dropped from memory (free 278019440)
>>>> 15/03/26 14:50:00 INFO BlockManagerMaster: Updated info of block
>>>> broadcast_0_piece0
>>>> 15/03/26 14:50:00 ERROR TransportRequestHandler: Error while invoking
>>>> RpcHandler#receive() on RPC id 7178767328921933569
>>>> java.lang.ClassNotFoundException: org/apache/spark/storage/StorageLevel
>>>> at java.lang.Class.forName0(Native Method)
>>>> at java.lang.Class.forName(Class.java:344)
>>>> at
>>>> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:65)
>>>> at
>>>> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
>>>> at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
>>>> at
>>>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
>>>> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>>>> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>>>> at
>>>> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:68)
>>>> at
>>>> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:88)
>>>> at
>>>> org.apache.spark.network.netty.NettyBlockRpcServer.receive(NettyBlockRpcServer.scala:65)
>>>> at
>>>> org.apache.spark.network.server.TransportRequestHandler.processRpcRequest(TransportRequestHandler.java:124)
>>>> at
>>>> org.apache.spark.network.server.TransportRequestHandler.handle(TransportRequestHandler.java:97)
>>>> at
>>>> org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:91)
>>>> at
>>>> org.apache.spark.network.server.TransportChannelHandler.channelRead0(TransportChannelHandler.java:44)
>>>> at
>>>> io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:105)
>>>> at
>>>> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHa

Re: Spark SQL Thrift Server start exception : java.lang.ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory

2015-03-03 Thread Anusha Shamanur
I downloaded different versions of the jars and it worked.

Thanks!

On Tue, Mar 3, 2015 at 4:45 PM, Cheng, Hao  wrote:

>  Which version / distribution are you using? Please references this blog
> that Felix C posted if you’re running on CDH.
>
>
> http://eradiating.wordpress.com/2015/02/22/getting-hivecontext-to-work-in-cdh/
>
>
>
> Or you may also need to download the datanucleus*.jar files try to add the
> option of “--jars” while starting the spark shell.
>
>
>
> *From:* Anusha Shamanur [mailto:anushas...@gmail.com]
> *Sent:* Wednesday, March 4, 2015 5:07 AM
> *To:* Cheng, Hao
> *Subject:* Re: Spark SQL Thrift Server start exception :
> java.lang.ClassNotFoundException:
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory
>
>
>
> Hi,
>
>
>
> I am getting the same error. There is no lib folder in my $SPARK_HOME. But
> I included these jars while calling spark-shell.
>
>
>
> Now, I get this:
>
> Caused by: org.datanucleus.exceptions.ClassNotResolvedException: Class
> "org.datanucleus.store.rdbms.RDBMSStoreManager" was not found in the
> CLASSPATH. Please check your specification and your CLASSPATH.
>
>at
> org.datanucleus.ClassLoaderResolverImpl.classForName(ClassLoaderResolverImpl.java:218)
>
>
>
> How do I solve this?
>
>
>
> On Mon, Mar 2, 2015 at 11:04 PM, Cheng, Hao  wrote:
>
> Copy those jars into the $SPARK_HOME/lib/
>
> datanucleus-api-jdo-3.2.6.jar
> datanucleus-core-3.2.10.jar
> datanucleus-rdbms-3.2.9.jar
>
> see
> https://github.com/apache/spark/blob/master/bin/compute-classpath.sh#L120
>
>
>
> -Original Message-
> From: fanooos [mailto:dev.fano...@gmail.com]
> Sent: Tuesday, March 3, 2015 2:50 PM
> To: user@spark.apache.org
> Subject: Spark SQL Thrift Server start exception :
> java.lang.ClassNotFoundException:
> org.datanucleus.api.jdo.JDOPersistenceManagerFactory
>
> I have installed a hadoop cluster (version : 2.6.0), apache spark (version
> :
> 1.2.1 preBuilt for hadoop 2.4 and later), and hive (version 1.0.0).
>
> When I try to start the spark sql thrift server I am getting the following
> exception.
>
> Exception in thread "main" java.lang.RuntimeException:
> java.lang.RuntimeException: Unable to instantiate
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient
> at
> org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:346)
> at
>
> org.apache.spark.sql.hive.HiveContext$$anonfun$4.apply(HiveContext.scala:235)
> at
>
> org.apache.spark.sql.hive.HiveContext$$anonfun$4.apply(HiveContext.scala:231)
> at scala.Option.orElse(Option.scala:257)
> at
> org.apache.spark.sql.hive.HiveContext.x$3$lzycompute(HiveContext.scala:231)
> at org.apache.spark.sql.hive.HiveContext.x$3(HiveContext.scala:229)
> at
>
> org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:229)
> at
> org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:229)
> at
> org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:292)
> at
> org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:276)
> at
> org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:248)
> at
> org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:91)
> at
> org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:90)
> at
>
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
> at
> scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
> at org.apache.spark.sql.SQLContext.(SQLContext.scala:90)
> at
> org.apache.spark.sql.hive.HiveContext.(HiveContext.scala:72)
> at
>
> org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:51)
> at
>
> org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:56)
> at
>
> org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at
> org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> Caused by: java.lan

RE: Spark SQL Thrift Server start exception : java.lang.ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory

2015-03-03 Thread Cheng, Hao
Which version / distribution are you using? Please reference this blog that 
Felix C posted if you’re running on CDH.
http://eradiating.wordpress.com/2015/02/22/getting-hivecontext-to-work-in-cdh/

Or you may also need to download the datanucleus*.jar files and try to add the 
option of “--jars” while starting the spark shell.

From: Anusha Shamanur [mailto:anushas...@gmail.com]
Sent: Wednesday, March 4, 2015 5:07 AM
To: Cheng, Hao
Subject: Re: Spark SQL Thrift Server start exception : 
java.lang.ClassNotFoundException: 
org.datanucleus.api.jdo.JDOPersistenceManagerFactory

Hi,

I am getting the same error. There is no lib folder in my $SPARK_HOME. But I 
included these jars while calling spark-shell.

Now, I get this:

Caused by: org.datanucleus.exceptions.ClassNotResolvedException: Class 
"org.datanucleus.store.rdbms.RDBMSStoreManager" was not found in the CLASSPATH. 
Please check your specification and your CLASSPATH.

   at 
org.datanucleus.ClassLoaderResolverImpl.classForName(ClassLoaderResolverImpl.java:218)



How do I solve this?

On Mon, Mar 2, 2015 at 11:04 PM, Cheng, Hao 
mailto:hao.ch...@intel.com>> wrote:
Copy those jars into the $SPARK_HOME/lib/

datanucleus-api-jdo-3.2.6.jar
datanucleus-core-3.2.10.jar
datanucleus-rdbms-3.2.9.jar

see https://github.com/apache/spark/blob/master/bin/compute-classpath.sh#L120


-Original Message-
From: fanooos [mailto:dev.fano...@gmail.com<mailto:dev.fano...@gmail.com>]
Sent: Tuesday, March 3, 2015 2:50 PM
To: user@spark.apache.org<mailto:user@spark.apache.org>
Subject: Spark SQL Thrift Server start exception : 
java.lang.ClassNotFoundException: 
org.datanucleus.api.jdo.JDOPersistenceManagerFactory

I have installed a hadoop cluster (version : 2.6.0), apache spark (version :
1.2.1 preBuilt for hadoop 2.4 and later), and hive (version 1.0.0).

When I try to start the spark sql thrift server I am getting the following 
exception.

Exception in thread "main" java.lang.RuntimeException:
java.lang.RuntimeException: Unable to instantiate 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:346)
at
org.apache.spark.sql.hive.HiveContext$$anonfun$4.apply(HiveContext.scala:235)
at
org.apache.spark.sql.hive.HiveContext$$anonfun$4.apply(HiveContext.scala:231)
at scala.Option.orElse(Option.scala:257)
at
org.apache.spark.sql.hive.HiveContext.x$3$lzycompute(HiveContext.scala:231)
at org.apache.spark.sql.hive.HiveContext.x$3(HiveContext.scala:229)
at
org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:229)
at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:229)
at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:292)
at 
org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:276)
at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:248)
at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:91)
at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:90)
at
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at org.apache.spark.sql.SQLContext.(SQLContext.scala:90)
at org.apache.spark.sql.hive.HiveContext.(HiveContext.scala:72)
at
org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:51)
at
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:56)
at
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: Unable to instantiate 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at
org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1412)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:62)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
at
org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
at
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.j

RE: Spark SQL Thrift Server start exception : java.lang.ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory

2015-03-02 Thread Cheng, Hao
Copy those jars into the $SPARK_HOME/lib/

datanucleus-api-jdo-3.2.6.jar
datanucleus-core-3.2.10.jar
datanucleus-rdbms-3.2.9.jar

see https://github.com/apache/spark/blob/master/bin/compute-classpath.sh#L120
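
A quick way to confirm whether the jars actually ended up on the driver classpath (a hypothetical diagnostic, not something from this thread) is to probe for the missing class from spark-shell:

// Probe for the DataNucleus JDO class after copying the jars into
// $SPARK_HOME/lib or passing them via --jars.
try {
  Class.forName("org.datanucleus.api.jdo.JDOPersistenceManagerFactory")
  println("datanucleus-api-jdo is on the classpath")
} catch {
  case _: ClassNotFoundException => println("datanucleus-api-jdo is NOT on the classpath")
}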


-Original Message-
From: fanooos [mailto:dev.fano...@gmail.com] 
Sent: Tuesday, March 3, 2015 2:50 PM
To: user@spark.apache.org
Subject: Spark SQL Thrift Server start exception : 
java.lang.ClassNotFoundException: 
org.datanucleus.api.jdo.JDOPersistenceManagerFactory

I have installed a hadoop cluster (version : 2.6.0), apache spark (version :
1.2.1 preBuilt for hadoop 2.4 and later), and hive (version 1.0.0). 

When I try to start the spark sql thrift server I am getting the following 
exception. 

Exception in thread "main" java.lang.RuntimeException:
java.lang.RuntimeException: Unable to instantiate 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:346)
at
org.apache.spark.sql.hive.HiveContext$$anonfun$4.apply(HiveContext.scala:235)
at
org.apache.spark.sql.hive.HiveContext$$anonfun$4.apply(HiveContext.scala:231)
at scala.Option.orElse(Option.scala:257)
at
org.apache.spark.sql.hive.HiveContext.x$3$lzycompute(HiveContext.scala:231)
at org.apache.spark.sql.hive.HiveContext.x$3(HiveContext.scala:229)
at
org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:229)
at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:229)
at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:292)
at 
org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:276)
at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:248)
at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:91)
at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:90)
at
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at org.apache.spark.sql.SQLContext.(SQLContext.scala:90)
at org.apache.spark.sql.hive.HiveContext.(HiveContext.scala:72)
at
org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:51)
at
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:56)
at
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: Unable to instantiate 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at
org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1412)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.(RetryingMetaStoreClient.java:62)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
at
org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
at
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
... 26 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at
org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
... 31 more
Caused by: javax.jdo.JDOFatalUserException: Class 
org.datanucleus.api.jdo.JDOPersistenceManagerFactory was not found.
NestedThrowables:
java.lang.ClassNotFoundException:
org.datanucleus.api.jdo.JDOPersistenceManagerFactory
at
javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1175)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
at
org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:310)
at
org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:339)

Spark SQL Thrift Server start exception : java.lang.ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory

2015-03-02 Thread fanooos
I have installed a Hadoop cluster (version 2.6.0), Apache Spark (version 1.2.1,
prebuilt for Hadoop 2.4 and later), and Hive (version 1.0.0).

When I try to start the Spark SQL Thrift Server I get the following
exception.

Exception in thread "main" java.lang.RuntimeException:
java.lang.RuntimeException: Unable to instantiate
org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:346)
at
org.apache.spark.sql.hive.HiveContext$$anonfun$4.apply(HiveContext.scala:235)
at
org.apache.spark.sql.hive.HiveContext$$anonfun$4.apply(HiveContext.scala:231)
at scala.Option.orElse(Option.scala:257)
at
org.apache.spark.sql.hive.HiveContext.x$3$lzycompute(HiveContext.scala:231)
at org.apache.spark.sql.hive.HiveContext.x$3(HiveContext.scala:229)
at
org.apache.spark.sql.hive.HiveContext.hiveconf$lzycompute(HiveContext.scala:229)
at org.apache.spark.sql.hive.HiveContext.hiveconf(HiveContext.scala:229)
at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:292)
at 
org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:276)
at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:248)
at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:91)
at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:90)
at
scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at org.apache.spark.sql.SQLContext.<init>(SQLContext.scala:90)
at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:72)
at
org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:51)
at
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveThriftServer2.scala:56)
at
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThriftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:358)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: Unable to instantiate
org.apache.hadoop.hive.metastore.HiveMetaStoreClient
at
org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1412)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
at
org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
at
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
... 26 more
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at
org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
... 31 more
Caused by: javax.jdo.JDOFatalUserException: Class
org.datanucleus.api.jdo.JDOPersistenceManagerFactory was not found.
NestedThrowables:
java.lang.ClassNotFoundException:
org.datanucleus.api.jdo.JDOPersistenceManagerFactory
at
javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1175)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
at
org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:310)
at
org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:339)
at
org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:248)
at
org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:223)
at 
org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
at
org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
at
org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
at
org.apache.hadoop.hive.metastore.RawStoreProx

[graphx] failed to submit an application with java.lang.ClassNotFoundException

2014-11-27 Thread Yifan LI
Hi,

I just tried to submit an application from the graphx examples directory, but it 
failed:

yifan2:bin yifanli$ MASTER=local[*] ./run-example graphx.PPR_hubs
java.lang.ClassNotFoundException: org.apache.spark.examples.graphx.PPR_hubs
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:249)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:318)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

and also,
yifan2:bin yifanli$ ./spark-submit --class 
org.apache.spark.examples.graphx.PPR_hubs 
../examples/target/scala-2.10/spark-examples-1.2.0-SNAPSHOT-hadoop1.0.4.jar
java.lang.ClassNotFoundException: org.apache.spark.examples.graphx.PPR_hubs
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:249)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:318)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Does anyone have any pointers on this?



Best,
Yifan LI







Re: Fixed:spark 1.1.0 - hbase 0.98.6-hadoop2 version - py4j.protocol.Py4JJavaError java.lang.ClassNotFoundException

2014-10-06 Thread serkan.dogan
Thanks MLnick,

I fixed the error.

First I compiled Spark with the original version; later I downloaded this pom file
into the examples folder:

https://github.com/tedyu/spark/commit/70fb7b4ea8fd7647e4a4ddca4df71521b749521c


Then I recompiled with Maven:


mvn -Dhbase.profile=hadoop-provided -Phadoop-2.4 -Dhadoop.version=2.4.1
-DskipTests clean package 

Now everything is ok.





--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-1-0-hbase-0-98-6-hadoop2-version-py4j-protocol-Py4JJavaError-java-lang-ClassNotFoundException-tp15668p15778.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: spark 1.1.0 - hbase 0.98.6-hadoop2 version - py4j.protocol.Py4JJavaError java.lang.ClassNotFoundException

2014-10-04 Thread Nick Pentreath
/03 11:27:16 INFO SparkUI: Started SparkUI at
>> http://1-1-1-1-1.rev.mydomain.io:4040
>> 14/10/03 11:27:16 INFO Utils: Copying
>>
>> /home/downloads/spark/spark-1.1.0/./examples/src/main/python/hbase_inputformat.py
>> to /tmp/spark-7232227a-0547-454e-9f68-805fa7b0c2f0/hbase_inputformat.py
>> 14/10/03 11:27:16 INFO SparkContext: Added file
>>
>> file:/home/downloads/spark/spark-1.1.0/./examples/src/main/python/hbase_inputformat.py
>> at http://1.1.1.1:49611/files/hbase_inputformat.py with timestamp
>> 1412324836837
>> 14/10/03 11:27:16 INFO AkkaUtils: Connecting to HeartbeatReceiver:
>> akka.tcp://
>> sparkdri...@1-1-1-1-1.rev.mydomain.io:49256/user/HeartbeatReceiver
>> Traceback (most recent call last):
>>   File
>>
>> "/home/downloads/spark/spark-1.1.0/./examples/src/main/python/hbase_inputformat.py",
>> line 70, in 
>> conf=conf)
>>   File "/home/downloads/spark/spark-1.1.0/python/pyspark/context.py", line
>> 471, in newAPIHadoopRDD
>> jconf, batchSize)
>>   File
>>
>> "/usr/lib/python2.6/site-packages/py4j-0.8.2.1-py2.6.egg/py4j/java_gateway.py",
>> line 538, in __call__
>> self.target_id, self.name)
>>   File
>>
>> "/usr/lib/python2.6/site-packages/py4j-0.8.2.1-py2.6.egg/py4j/protocol.py",
>> line 300, in get_return_value
>> format(target_id, '.', name), value)
>> py4j.protocol.Py4JJavaError: An error occurred while calling
>> z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
>> : java.lang.ClassNotFoundException:
>> org.apache.hadoop.hbase.io.ImmutableBytesWritable
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
>> at java.lang.Class.forName0(Native Method)
>> at java.lang.Class.forName(Class.java:270)
>> at org.apache.spark.util.Utils$.classForName(Utils.scala:150)
>> at
>>
>> org.apache.spark.api.python.PythonRDD$.newAPIHadoopRDDFromClassNames(PythonRDD.scala:451)
>> at
>>
>> org.apache.spark.api.python.PythonRDD$.newAPIHadoopRDD(PythonRDD.scala:436)
>> at
>> org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD(PythonRDD.scala)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>>
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>> at
>>
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:622)
>> at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
>> at
>> py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
>> at py4j.Gateway.invoke(Gateway.java:259)
>> at
>> py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
>> at py4j.commands.CallCommand.execute(CallCommand.java:79)
>> at py4j.GatewayConnection.run(GatewayConnection.java:207)
>> at java.lang.Thread.run(Thread.java:701)
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-1-0-hbase-0-98-6-hadoop2-version-py4j-protocol-Py4JJavaError-java-lang-ClassNotFoundException-tp15668.html
>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>>
>


spark 1.1.0 - hbase 0.98.6-hadoop2 version - py4j.protocol.Py4JJavaError java.lang.ClassNotFoundException

2014-10-03 Thread serkan.dogan
rotocol.Py4JJavaError: An error occurred while calling
z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: java.lang.ClassNotFoundException:
org.apache.hadoop.hbase.io.ImmutableBytesWritable
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.spark.util.Utils$.classForName(Utils.scala:150)
at
org.apache.spark.api.python.PythonRDD$.newAPIHadoopRDDFromClassNames(PythonRDD.scala:451)
at
org.apache.spark.api.python.PythonRDD$.newAPIHadoopRDD(PythonRDD.scala:436)
at
org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD(PythonRDD.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:259)
at
py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:701)



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-1-0-hbase-0-98-6-hadoop2-version-py4j-protocol-Py4JJavaError-java-lang-ClassNotFoundException-tp15668.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



spark 1.1.0 - hbase 0.98.6-hadoop2 version - py4j.protocol.Py4JJavaError java.lang.ClassNotFoundException

2014-10-03 Thread serkan.dogan
   format(target_id, '.', name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling
z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD.
: java.lang.ClassNotFoundException:
org.apache.hadoop.hbase.io.ImmutableBytesWritable
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at org.apache.spark.util.Utils$.classForName(Utils.scala:150)
at
org.apache.spark.api.python.PythonRDD$.newAPIHadoopRDDFromClassNames(PythonRDD.scala:451)
at
org.apache.spark.api.python.PythonRDD$.newAPIHadoopRDD(PythonRDD.scala:436)
at
org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD(PythonRDD.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:622)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at
py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
at py4j.Gateway.invoke(Gateway.java:259)
at
py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:207)
at java.lang.Thread.run(Thread.java:701)



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-1-0-hbase-0-98-6-hadoop2-version-py4j-protocol-Py4JJavaError-java-lang-ClassNotFoundException-tp15666.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: java.lang.ClassNotFoundException on driver class in executor

2014-09-23 Thread Barrington Henry
Hi Andrew,

Thanks for the prompt response. I tried the command line and it works fine. But I 
want to run from the IDE for easier debugging and more transparency into code 
execution. I will try to see if there is any way to get the jar over to the 
executor from within the IDE.

- Barrington
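
One way to get the jar over (a sketch based on an assumption, not something confirmed in this thread) is to build the application jar first, e.g. with sbt package or mvn package, and point setJars at the built artifact explicitly; SparkContext.jarOfClass returns nothing here because the class is loaded from the IDE's classes directory rather than from a jar:

import org.apache.spark.SparkConf

// Ship an explicitly built application jar to the executors instead of relying
// on jarOfClass; the path below is a placeholder for the jar your build produces.
val conf = new SparkConf()
  .set("spark.driver.host", "barrymac")
  .setMaster("yarn-client")
  .setAppName("Lasco Script")
  .setJars(Seq("/path/to/lasco-script.jar"))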

> On Sep 21, 2014, at 10:52 PM, Andrew Or  wrote:
> 
> Hi Barrington,
> 
> Have you tried running it from the command line? (i.e. bin/spark-submit 
> --master yarn-client --class YOUR_CLASS YOUR_JAR) Does it still fail? I am 
> not super familiar with running Spark through intellij, but the AFAIK the 
> classpaths are setup a little differently there. Also, Spark submit does this 
> for you nicely, so if you go through this path you don't even have to call 
> `setJars` as you did in your application.
> 
> -Andrew
> 
> 2014-09-21 12:52 GMT-07:00 Barrington Henry  <mailto:barrington.he...@me.com>>:
> Hi,
> 
> I am running spark from my IDE (InteliJ) using YARN as my cluster manager. 
> However, the executor node is not able to find my main driver class 
> “LascoScript”. I keep getting  java.lang.ClassNotFoundException.
> I tried adding  the jar of the main class by running the snippet below
> 
> 
>val conf = new SparkConf().set("spark.driver.host", "barrymac")
>   .setMaster("yarn-client")
>   .setAppName("Lasco Script”)
>   
> .setJars(SparkContext.jarOfClass(this.getClass).toSeq)
> 
> But the jarOfClass function returns nothing. See below for logs.
> 
> 
> 
> 14/09/21 10:53:15 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 
> barrymac): java.lang.ClassNotFoundException: LascoScript$$anonfun$1
> java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> java.security.AccessController.doPrivileged(Native Method)
> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> java.lang.ClassLoader.loadClass(ClassLoader.java:423)
> java.lang.ClassLoader.loadClass(ClassLoader.java:356)
> java.lang.Class.forName0(Native Method)
> java.lang.Class.forName(Class.java:264)
> 
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:59)
> 
> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1593)
> java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1514)
> 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1750)
> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
> 
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1964)
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1888)
> 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
> 
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1964)
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1888)
> 
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
> java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
> 
> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
> 
> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:57)
> org.apache.spark.scheduler.Task.run(Task.scala:54)
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
> 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> java.lang.Thread.run(Thread.java:722)
> 14/09/21 10:53:15 INFO TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1) on 
> executor barrymac: java.lang.ClassNotFoundException (LascoScript$$anonfun$1) 
> [duplicate 1]
> 14/09/21 10:53:15 INFO TaskSetManager: Starting task 1.1 in stage 0.0 (TID 4, 
> barrymac, NODE_LOCAL, 1312 bytes)
> 14/09/21 10:53:15 INFO TaskSetManager: Lost task 2.0 in stage 0.0 (TID 2) on 
> executor barrymac: java.lang.ClassNotFoundException (LascoScript$$anonfun$1) 
> [duplicate 2]
> 14/09/21 10:53:15 INFO TaskSetManager: Starting task 2.1 in stage 0.0 (TID 5, 
> barrymac, NODE_LOCAL, 1312 bytes)
> 14/09/21 10:53:15 INFO TaskSetManager: Lost task 3.0 in st

Re: java.lang.ClassNotFoundException on driver class in executor

2014-09-21 Thread Andrew Or
Hi Barrington,

Have you tried running it from the command line? (i.e. bin/spark-submit
--master yarn-client --class YOUR_CLASS YOUR_JAR) Does it still fail? I am
not super familiar with running Spark through IntelliJ, but AFAIK the
classpaths are set up a little differently there. Also, spark-submit does
this for you nicely, so if you go through this path you don't even have to
call `setJars` as you did in your application.

-Andrew

2014-09-21 12:52 GMT-07:00 Barrington Henry :

> Hi,
>
> I am running spark from my IDE (InteliJ) using YARN as my cluster manager.
> However, the executor node is not able to find my main driver class
> “LascoScript”. I keep getting  java.lang.ClassNotFoundException.
> I tried adding  the jar of the main class by running the snippet below
>
>
>val conf = new SparkConf().set("spark.driver.host", "barrymac")
>   .setMaster("yarn-client")
>   .setAppName("Lasco Script”)
>
> .setJars(SparkContext.jarOfClass(this.getClass).toSeq)
>
> But the jarOfClass function returns nothing. See below for logs.
>
> 
>
> 14/09/21 10:53:15 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0,
> barrymac): java.lang.ClassNotFoundException: LascoScript$$anonfun$1
> java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> java.security.AccessController.doPrivileged(Native Method)
> java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> java.lang.ClassLoader.loadClass(ClassLoader.java:423)
> java.lang.ClassLoader.loadClass(ClassLoader.java:356)
> java.lang.Class.forName0(Native Method)
> java.lang.Class.forName(Class.java:264)
>
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:59)
>
> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1593)
>
> java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1514)
>
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1750)
> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
>
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1964)
>
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1888)
>
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
>
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1964)
>
> java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1888)
>
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
> java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
> java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)
>
> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)
>
> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
> org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:57)
> org.apache.spark.scheduler.Task.run(Task.scala:54)
>
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> java.lang.Thread.run(Thread.java:722)
> 14/09/21 10:53:15 INFO TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1)
> on executor barrymac: java.lang.ClassNotFoundException
> (LascoScript$$anonfun$1) [duplicate 1]
> 14/09/21 10:53:15 INFO TaskSetManager: Starting task 1.1 in stage 0.0 (TID
> 4, barrymac, NODE_LOCAL, 1312 bytes)
> 14/09/21 10:53:15 INFO TaskSetManager: Lost task 2.0 in stage 0.0 (TID 2)
> on executor barrymac: java.lang.ClassNotFoundException
> (LascoScript$$anonfun$1) [duplicate 2]
> 14/09/21 10:53:15 INFO TaskSetManager: Starting task 2.1 in stage 0.0 (TID
> 5, barrymac, NODE_LOCAL, 1312 bytes)
> 14/09/21 10:53:15 INFO TaskSetManager: Lost task 3.0 in stage 0.0 (TID 3)
> on executor barrymac: java.lang.ClassNotFoundException
> (LascoScript$$anonfun$1) [duplicate 3]
> 14/09/21 10:53:15 INFO TaskSetManager: Starting task 3.1 in stage 0.0 (TID
> 6, barrymac, NODE_LOCAL, 1312 bytes)
> 14/09/21 10:53:15 INFO TaskSetManager: Lost task 1.1 in stage 0.0 (TID 4)
> on executor barrymac: java.lang.ClassNotFoundException
> (LascoScript$$anonfun$1) [duplicate 4]
> 14/09/21 10:53:15 INFO TaskSetManager: Starting task 1.2 in stage 0.0 (TID
> 7, barrymac, NODE_LOCAL, 1312 bytes)
> 14/09/21 10:53:15 INFO TaskSetManager: Lost task 2.1 in s

java.lang.ClassNotFoundException on driver class in executor

2014-09-21 Thread Barrington Henry
Hi,

I am running Spark from my IDE (IntelliJ) using YARN as my cluster manager. 
However, the executor node is not able to find my main driver class 
“LascoScript”. I keep getting java.lang.ClassNotFoundException.
I tried adding the jar of the main class by running the snippet below


    val conf = new SparkConf().set("spark.driver.host", "barrymac")
      .setMaster("yarn-client")
      .setAppName("Lasco Script")
      .setJars(SparkContext.jarOfClass(this.getClass).toSeq)

But the jarOfClass function returns nothing. See below for logs.



14/09/21 10:53:15 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 
barrymac): java.lang.ClassNotFoundException: LascoScript$$anonfun$1
java.net.URLClassLoader$1.run(URLClassLoader.java:366)
java.net.URLClassLoader$1.run(URLClassLoader.java:355)
java.security.AccessController.doPrivileged(Native Method)
java.net.URLClassLoader.findClass(URLClassLoader.java:354)
java.lang.ClassLoader.loadClass(ClassLoader.java:423)
java.lang.ClassLoader.loadClass(ClassLoader.java:356)
java.lang.Class.forName0(Native Method)
java.lang.Class.forName(Class.java:264)

org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:59)
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1593)
java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1514)

java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1750)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1964)
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1888)

java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1964)
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1888)

java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1347)
java.io.ObjectInputStream.readObject(ObjectInputStream.java:369)

org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62)

org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:87)
org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:57)
org.apache.spark.scheduler.Task.run(Task.scala:54)
org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)

java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)

java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
java.lang.Thread.run(Thread.java:722)
14/09/21 10:53:15 INFO TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1) on 
executor barrymac: java.lang.ClassNotFoundException (LascoScript$$anonfun$1) 
[duplicate 1]
14/09/21 10:53:15 INFO TaskSetManager: Starting task 1.1 in stage 0.0 (TID 4, 
barrymac, NODE_LOCAL, 1312 bytes)
14/09/21 10:53:15 INFO TaskSetManager: Lost task 2.0 in stage 0.0 (TID 2) on 
executor barrymac: java.lang.ClassNotFoundException (LascoScript$$anonfun$1) 
[duplicate 2]
14/09/21 10:53:15 INFO TaskSetManager: Starting task 2.1 in stage 0.0 (TID 5, 
barrymac, NODE_LOCAL, 1312 bytes)
14/09/21 10:53:15 INFO TaskSetManager: Lost task 3.0 in stage 0.0 (TID 3) on 
executor barrymac: java.lang.ClassNotFoundException (LascoScript$$anonfun$1) 
[duplicate 3]
14/09/21 10:53:15 INFO TaskSetManager: Starting task 3.1 in stage 0.0 (TID 6, 
barrymac, NODE_LOCAL, 1312 bytes)
14/09/21 10:53:15 INFO TaskSetManager: Lost task 1.1 in stage 0.0 (TID 4) on 
executor barrymac: java.lang.ClassNotFoundException (LascoScript$$anonfun$1) 
[duplicate 4]
14/09/21 10:53:15 INFO TaskSetManager: Starting task 1.2 in stage 0.0 (TID 7, 
barrymac, NODE_LOCAL, 1312 bytes)
14/09/21 10:53:15 INFO TaskSetManager: Lost task 2.1 in stage 0.0 (TID 5) on 
executor barrymac: java.lang.ClassNotFoundException (LascoScript$$anonfun$1) 
[duplicate 5]
14/09/21 10:53:15 INFO TaskSetManager: Starting task 2.2 in stage 0.0 (TID 8, 
barrymac, NODE_LOCAL, 1312 bytes)
14/09/21 10:53:15 INFO TaskSetManager: Lost task 3.1 in stage 0.0 (TID 6) on 
executor barrymac: java.lang.ClassNotFoundException (LascoScript$$anonfun$1) 
[duplicate 6]
14/09/21 10:53:15 INFO TaskSetManager: Starting task 3.2 in stage 0.0 (TID 9, 
barrymac, NODE_LOCAL, 1312 bytes)
14/09/21 10:53:15 INFO TaskSetManager: Lost task 1.2 in stage 0.0 (TID 7) on 
executor barrymac: java.lang.ClassNotFoundException (LascoScript$$anonfun$1) 
[duplicate 7]
14/09/21 10:53:15 INFO TaskSetManager: Starting task 1.3 in stage

Exception failure: java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.KafkaReceiver

2014-05-30 Thread Margusja
 INFO TaskSchedulerImpl: Adding task set 6.0 with 1 tasks
14/05/30 11:53:56 INFO TaskSetManager: Starting task 6.0:0 as TID 72 on 
executor 0: dlvm1 (PROCESS_LOCAL)
14/05/30 11:53:56 INFO TaskSetManager: Serialized task 6.0:0 as 1958 
bytes in 0 ms
14/05/30 11:53:56 INFO DAGScheduler: Got job 4 (runJob at 
NetworkInputTracker.scala:182) with 1 output partitions (allowLocal=false)
14/05/30 11:53:56 INFO DAGScheduler: Final stage: Stage 8 (runJob at 
NetworkInputTracker.scala:182)

14/05/30 11:53:56 INFO DAGScheduler: Parents of final stage: List()
14/05/30 11:53:56 INFO DAGScheduler: Missing parents: List()
14/05/30 11:53:56 INFO DAGScheduler: Submitting Stage 8 
(ParallelCollectionRDD[0] at makeRDD at NetworkInputTracker.scala:165), 
which has no missing parents
14/05/30 11:53:56 INFO MapOutputTrackerMasterActor: Asked to send map 
output locations for shuffle 2 to spark@dlvm1:48363
14/05/30 11:53:56 INFO MapOutputTrackerMaster: Size of output statuses 
for shuffle 2 is 82 bytes
14/05/30 11:53:56 INFO TaskSetManager: Finished TID 72 in 37 ms on dlvm1 
(progress: 1/1)
14/05/30 11:53:56 INFO TaskSchedulerImpl: Removed TaskSet 6.0, whose 
tasks have all completed, from pool
14/05/30 11:53:56 INFO DAGScheduler: Submitting 1 missing tasks from 
Stage 8 (ParallelCollectionRDD[0] at makeRDD at 
NetworkInputTracker.scala:165)

14/05/30 11:53:56 INFO TaskSchedulerImpl: Adding task set 8.0 with 1 tasks
14/05/30 11:53:56 INFO TaskSetManager: Starting task 8.0:0 as TID 73 on 
executor 0: dlvm1 (PROCESS_LOCAL)
14/05/30 11:53:56 INFO TaskSetManager: Serialized task 8.0:0 as 2975 
bytes in 1 ms

14/05/30 11:53:56 INFO DAGScheduler: Completed ResultTask(6, 0)
14/05/30 11:53:56 INFO DAGScheduler: Stage 6 (take at DStream.scala:586) 
finished in 0.051 s
14/05/30 11:53:56 INFO SparkContext: Job finished: take at 
DStream.scala:586, took 0.087153883 s

14/05/30 11:53:56 INFO SparkContext: Starting job: take at DStream.scala:586
14/05/30 11:53:56 INFO DAGScheduler: Got job 5 (take at 
DStream.scala:586) with 1 output partitions (allowLocal=true)
14/05/30 11:53:56 INFO DAGScheduler: Final stage: Stage 9 (take at 
DStream.scala:586)

14/05/30 11:53:56 INFO DAGScheduler: Parents of final stage: List(Stage 10)
14/05/30 11:53:56 INFO DAGScheduler: Missing parents: List()
14/05/30 11:53:56 INFO DAGScheduler: Submitting Stage 9 
(MapPartitionsRDD[19] at combineByKey at ShuffledDStream.scala:42), 
which has no missing parents
14/05/30 11:53:56 INFO DAGScheduler: Submitting 1 missing tasks from 
Stage 9 (MapPartitionsRDD[19] at combineByKey at ShuffledDStream.scala:42)

14/05/30 11:53:56 INFO TaskSchedulerImpl: Adding task set 9.0 with 1 tasks
14/05/30 11:53:56 INFO TaskSetManager: Starting task 9.0:0 as TID 74 on 
executor 0: dlvm1 (PROCESS_LOCAL)
14/05/30 11:53:56 INFO TaskSetManager: Serialized task 9.0:0 as 1958 
bytes in 0 ms

14/05/30 11:53:56 WARN TaskSetManager: Lost TID 73 (task 8.0:0)
14/05/30 11:53:56 WARN TaskSetManager: Loss was due to 
java.lang.ClassNotFoundException
java.lang.ClassNotFoundException: 
org.apache.spark.streaming.kafka.KafkaReceiver

at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)
at 
org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:37)
at 
java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
at 
java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
at 
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)

at java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706)
at 
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at 
java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
at 
java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
at 
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
at 
java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
at 
java.io.ObjectInputStream.defaultReadObject(ObjectInputStream.java:500)
at 
org.apache.spark.rdd.ParallelCollectionPartition.readObject(ParallelCollectionRDD.scala:72)

at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

at java.lang.reflect.Method.i
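
The executor stack trace above shows JavaDeserializationStream failing to resolve
KafkaReceiver, i.e. the spark-streaming-kafka classes never reached the executors'
classpath. A minimal sketch of one way to ship them, with hypothetical master URL and
jar paths (bundling the artifact into the application's fat jar is the other common
option):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class ShipKafkaJars {
        public static void main(String[] args) {
            // Master URL and jar paths are placeholders.
            SparkConf conf = new SparkConf()
                .setAppName("kafka-receiver-test")
                .setMaster("spark://master:7077");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // KafkaReceiver lives in the spark-streaming-kafka artifact, which is
            // not part of the core Spark assembly, so it and the Kafka client jars
            // it depends on have to reach the executors' classpath.
            sc.addJar("/path/to/spark-streaming-kafka_2.10-1.0.0.jar");
            sc.addJar("/path/to/kafka_2.10-0.8.0.jar");

            // ... build and start the streaming job here ...
            sc.stop();
        }
    }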

Re: java.lang.ClassNotFoundException

2014-05-12 Thread Archit Thakur
Hi Joe,

Your messages are going into the spam folder for me.

Thx, Archit_Thakur.


On Fri, May 2, 2014 at 9:22 AM, Joe L  wrote:

> Hi, You should include the jar file of your project. for example:
> conf.set("yourjarfilepath.jar")
>
> Joe
>   On Friday, May 2, 2014 7:39 AM, proofmoore [via Apache Spark User List]
> <[hidden email]> wrote:
>   HelIo. I followed "A Standalone App in Java" part of the tutorial
> https://spark.apache.org/docs/0.8.1/quick-start.html
>
> Spark standalone cluster looks it's running without a problem :
> http://i.stack.imgur.com/7bFv8.png
>
> I have built a fat jar for running this JavaApp on the cluster. Before
> maven package:
>
> find .
>
> ./pom.xml
> ./src
> ./src/main
> ./src/main/java
> ./src/main/java/SimpleApp.java
>
>
> content of SimpleApp.java is :
>
>  import org.apache.spark.api.java.*;
>  import org.apache.spark.api.java.function.Function;
>  import org.apache.spark.SparkConf;
>  import org.apache.spark.SparkContext;
>
>
>  public class SimpleApp {
>  public static void main(String[] args) {
>
>  SparkConf conf =  new SparkConf()
>.setMaster("spark://10.35.23.13:7077")
>.setAppName("My app")
>.set("spark.executor.memory", "1g");
>
>  JavaSparkContext   sc = new JavaSparkContext (conf);
>  String logFile = "/home/ubuntu/spark-0.9.1/test_data";
>  JavaRDD logData = sc.textFile(logFile).cache();
>
>  long numAs = logData.filter(new Function() {
>   public Boolean call(String s) { return s.contains("a"); }
>  }).count();
>
>  System.out.println("Lines with a: " + numAs);
>  }
>  }
>
> This program only works when master is set as setMaster("local").
> Otherwise I get this error : http://i.stack.imgur.com/doRSn.png
>
> Thanks,
> Ibrahim
>
>
>
> --
> View this message in context: Re: java.lang.ClassNotFoundException
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>


Re: java.lang.ClassNotFoundException - spark on mesos

2014-05-02 Thread bo...@shopify.com
I have opened a PR for discussion on the apache/spark repository
https://github.com/apache/spark/pull/620

There is certainly a classLoader problem in the way Mesos and Spark operate. I'm not
sure what caused it to suddenly stop working, so I'd like to open the discussion there



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/java-lang-ClassNotFoundException-spark-on-mesos-tp3510p5245.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.


RE: java.lang.ClassNotFoundException

2014-05-02 Thread İbrahim Rıza HALLAÇ
Things I tried and the errors are:

1) String path =
   "/home/ubuntu/spark-0.9.1/SimpleApp/target/simple-project-1.0-allinone.jar";
   ... .set(path)

   $ mvn package
   [ERROR] Failed to execute goal
   org.apache.maven.plugins:maven-compiler-plugin:2.0.2:compile (default-compile)
   on project simple-project: Compilation failure
   [ERROR] /home/ubuntu/spark-0.9.1/SimpleApp/src/main/java/SimpleApp.java:[14,23]
   error: method set in class SparkConf cannot be applied to given types;
   [ERROR] -> [Help 1]

2) String path =
   "/home/ubuntu/spark-0.9.1/SimpleApp/target/simple-project-1.0-allinone.jar";
   ... .setJars(path)

   $ mvn package
   [ERROR] Failed to execute goal
   org.apache.maven.plugins:maven-compiler-plugin:2.0.2:compile (default-compile)
   on project simple-project: Compilation failure
   [ERROR] /home/ubuntu/spark-0.9.1/SimpleApp/src/main/java/SimpleApp.java:[14,23]
   error: no suitable method found for setJars(String)
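
The first error is because SparkConf.set takes a key and a value; the second is a
signature mismatch: setJars expects an array (or Seq) of jar paths, not a single
String. A minimal sketch of the corrected call, reusing the path above (the class
name is only illustrative and the rest of the job is elided):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SimpleAppWithJars {
        public static void main(String[] args) {
            String path =
                "/home/ubuntu/spark-0.9.1/SimpleApp/target/simple-project-1.0-allinone.jar";

            SparkConf conf = new SparkConf()
                .setMaster("spark://10.35.23.13:7077")
                .setAppName("My app")
                .set("spark.executor.memory", "1g")
                // setJars takes an array of paths, which is why setJars(String)
                // did not compile above.
                .setJars(new String[] { path });

            JavaSparkContext sc = new JavaSparkContext(conf);
            // ... the rest of the job is unchanged from SimpleApp.java ...
            sc.stop();
        }
    }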

content of my pom.xml file (maybe the problem arises here) - the XML tags did not
survive the list archive, so only the element values are left:

  build plugins:
    org.apache.maven.plugins : maven-compiler-plugin
      source 1.7, target 1.7
    org.apache.maven.plugins : maven-shade-plugin : 1.5
      execution: phase "package", goal "shade"
      "true" / "allinone" (presumably shadedArtifactAttached and the shaded
        classifier, which matches the simple-project-1.0-allinone.jar name above)
      artifactSet includes: *:*
      filter on *:* excluding META-INF/*.SF, META-INF/*.DSA, META-INF/*.RSA
      transformers for: reference.conf, META-INF/spring.handlers,
        META-INF/spring.schemas, and main class com.echoed.chamber.Main

  project:
    groupId edu.berkeley, artifactId simple-project, modelVersion 4.0.0,
    name "Simple Project", packaging jar, version 1.0

  repository:
    "Akka repository" - http://repo.akka.io/releases

  dependency:
    org.apache.spark : spark-core_2.10 : 0.9.1

Date: Thu, 1 May 2014 20:52:59 -0700
From: selme...@yahoo.com
To: u...@spark.incubator.apache.org
Subject: Re: java.lang.ClassNotFoundException

Hi, You should include the jar file of your project. for example:
conf.set("yourjarfilepath.jar")

Joe

On Friday, May 2, 2014 7:39 AM, proofmoore [via Apache Spark User List]
<[hidden email]> wrote:

Hello. I followed the "A Standalone App in Java" part of the tutorial
https://spark.apache.org/docs/0.8.1/quick-start.html

Spark standalone cluster looks like it's running without a problem:
http://i.stack.imgur.com/7bFv8.png

I have built a fat jar for running this JavaApp on the cluster. Before maven
package:

    find .

    ./pom.xml
    ./src
    ./src/main
    ./src/main/java
    ./src/main/java/SimpleApp.java

content of SimpleApp.java is :

     import org.apache.spark.api.java.*;
     import org.apache.spark.api.java.function.Function;
     import org.apache.spark.SparkConf;
     import org.apache.spark.SparkContext;

     public class SimpleApp {
     public static void main(String[] args) {

     SparkConf conf =  new SparkConf()
                       .setMaster("spark://10.35.23.13:7077")
                       .setAppName("My app")
                       .set("spark.executor.memory", "1g");

     JavaSparkContext sc = new JavaSparkContext(conf);
     String logFile = "/home/ubuntu/spark-0.9.1/test_data";
     JavaRDD<String> logData = sc.textFile(logFile).cache();

     long numAs = logData.filter(new Function<String, Boolean>() {
      public Boolean call(String s) { return s.contains("a"); }
     }).count();

     System.out.println("Lines with a: " + numAs);
     }
     }

This program only works when the master is set with setMaster("local").
Otherwise I get this error: http://i.stack.imgur.com/doRSn.png

Thanks,
Ibrahim

View this message in context: Re: java.lang.ClassNotFoundException

Sent from the Apache Spark User List mailing list archive at Nabble.com.
  

Re: java.lang.ClassNotFoundException

2014-05-01 Thread Joe L
Hi, You should include the jar file of your project. for example: 
conf.set("yourjarfilepath.jar")

Joe
On Friday, May 2, 2014 7:39 AM, proofmoore [via Apache Spark User List] 
 wrote:
 
Hello. I followed the "A Standalone App in Java" part of the tutorial 
https://spark.apache.org/docs/0.8.1/quick-start.html

Spark standalone cluster looks like it's running without a problem: 
http://i.stack.imgur.com/7bFv8.png

I have built a fat jar for running this JavaApp on the cluster. Before maven 
package: 
   
    find .
    
    ./pom.xml
    ./src
    ./src/main
    ./src/main/java
    ./src/main/java/SimpleApp.java


content of SimpleApp.java is :

     import org.apache.spark.api.java.*;
     import org.apache.spark.api.java.function.Function;
     import org.apache.spark.SparkConf;
     import org.apache.spark.SparkContext;


     public class SimpleApp {
     public static void main(String[] args) {

     SparkConf conf =  new SparkConf()
                       .setMaster("spark://10.35.23.13:7077")
                       .setAppName("My app")
                       .set("spark.executor.memory", "1g");

     JavaSparkContext   sc = new JavaSparkContext (conf);
     String logFile = "/home/ubuntu/spark-0.9.1/test_data";
     JavaRDD<String> logData = sc.textFile(logFile).cache();

     long numAs = logData.filter(new Function<String, Boolean>() {
      public Boolean call(String s) { return s.contains("a"); }
     }).count();

     System.out.println("Lines with a: " + numAs); 
     }
     }
 
This program only works when the master is set with setMaster("local"). Otherwise I 
get this error: http://i.stack.imgur.com/doRSn.png

Thanks,
Ibrahim


 



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/java-lang-ClassNotFoundException-tp5191p5203.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

java.lang.ClassNotFoundException

2014-05-01 Thread İbrahim Rıza HALLAÇ



Hello. I followed the "A Standalone App in Java" part of the tutorial
https://spark.apache.org/docs/0.8.1/quick-start.html

Spark standalone cluster looks like it's running without a problem:
http://i.stack.imgur.com/7bFv8.png

I have built a fat jar for running this JavaApp on the cluster. Before maven
package:

    find .

    ./pom.xml
    ./src
    ./src/main
    ./src/main/java
    ./src/main/java/SimpleApp.java

content of SimpleApp.java is :

     import org.apache.spark.api.java.*;
     import org.apache.spark.api.java.function.Function;
     import org.apache.spark.SparkConf;
     import org.apache.spark.SparkContext;

     public class SimpleApp {
     public static void main(String[] args) {

     SparkConf conf =  new SparkConf()
                       .setMaster("spark://10.35.23.13:7077")
                       .setAppName("My app")
                       .set("spark.executor.memory", "1g");

     JavaSparkContext sc = new JavaSparkContext(conf);
     String logFile = "/home/ubuntu/spark-0.9.1/test_data";
     JavaRDD<String> logData = sc.textFile(logFile).cache();

     long numAs = logData.filter(new Function<String, Boolean>() {
      public Boolean call(String s) { return s.contains("a"); }
     }).count();

     System.out.println("Lines with a: " + numAs);
     }
     }

This program only works when the master is set with setMaster("local").
Otherwise I get this error: http://i.stack.imgur.com/doRSn.png

Thanks,
Ibrahim
  

Re: java.lang.ClassNotFoundException - spark on mesos

2014-04-02 Thread Bharath Bhushan
I tried several things in order to get 1.0.0 git tree to work with 
mesos. All my efforts failed. I could run spark 0.9.0 on mesos but not 
spark 1.0.0. Please suggest any other things I can try.


1. Change project/SparkBuild.scala to use mesos 0.17.0 and then 
make_distribution.sh.


2. Try building with maven. Threw the following error:
[INFO] Spark Project Parent POM .. SUCCESS 
[7:55.796s]

[INFO] Spark Project Core  FAILURE [7.209s]
...
[ERROR] Plugin org.apache.maven.plugins:maven-compiler-plugin:3.1 or one of its
dependencies could not be resolved: Failed to read artifact descriptor for
org.apache.maven.plugins:maven-compiler-plugin:jar:3.1: Could not find artifact
org.apache:apache:pom:13 -> [Help 1]

3. Change pom.xml to use mesos 0.17.0 and protobuf 2.5.0 and then 
make_distribution.sh


4. Change pom.xml to use mesos 0.17.0 and leave protobuf at 2.4.1 and 
then make_distribution.sh


5. Change pom.xml to use mesos 0.17.0 and remove the protobuf line and 
then make_distribution.sh


Question: Is the pom.xml modification picked up when building with sbt 
and maven or only when building with maven?


Thanks

On 01/04/14 11:04 am, Bharath Bhushan wrote:

Another problem I noticed is that the current 1.0.0 git tree still gives me the 
ClassNotFoundException. I see that the SPARK-1052 is already fixed there. I 
then modified the pom.xml for mesos and protobuf and that still gave the 
ClassNotFoundException. I also tried modifying pom.xml only for mesos and that 
fails too. So I have no way of running the 1.0.0 git tree spark on mesos yet.

Thanks.

On 01-Apr-2014, at 3:28 am, deric  wrote:


Which repository do you use?

The issue should be fixed in 0.9.1 and 1.0.0

https://spark-project.atlassian.net/browse/SPARK-1052


There's an old repository

https://github.com/apache/incubator-spark

and as Spark became one of the top-level projects, it was moved to the new repo:

https://github.com/apache/spark

The 0.9.1 version hasn't been released yet, so you should get it from the
new git repo.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/java-lang-ClassNotFoundException-spark-on-mesos-tp3510p3551.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.




Re: java.lang.ClassNotFoundException - spark on mesos

2014-03-31 Thread Bharath Bhushan
Another problem I noticed is that the current 1.0.0 git tree still gives me the 
ClassNotFoundException. I see that the SPARK-1052 is already fixed there. I 
then modified the pom.xml for mesos and protobuf and that still gave the 
ClassNotFoundException. I also tried modifying pom.xml only for mesos and that 
fails too. So I have no way of running the 1.0.0 git tree spark on mesos yet.

Thanks.

On 01-Apr-2014, at 3:28 am, deric  wrote:

> Which repository do you use?
> 
> The issue should be fixed in 0.9.1 and 1.0.0
> 
> https://spark-project.atlassian.net/browse/SPARK-1052
>   
> 
> There's an old repository 
> 
> https://github.com/apache/incubator-spark
> 
> and as Spark become one of top level projects, it was moved to new repo:
> 
> https://github.com/apache/spark
> 
> The 0.9.1 version hasn't been released yet, so you should get it from the
> new git repo.
> 
> 
> 
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/java-lang-ClassNotFoundException-spark-on-mesos-tp3510p3551.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.



Re: java.lang.ClassNotFoundException - spark on mesos

2014-03-31 Thread Bharath Bhushan
I was referring to the protobuf version issue as the one that is not fixed. I could
not find any reference to the problem or the fix.

Reg. SPARK-1052, I could pull in the fix into my 0.9.0 tree (from the tar ball 
on the website) and I see the fix in the latest git.

Thanks

On 01-Apr-2014, at 3:28 am, deric  wrote:

> Which repository do you use?
> 
> The issue should be fixed in 0.9.1 and 1.0.0
> 
> https://spark-project.atlassian.net/browse/SPARK-1052
>   
> 
> There's an old repository 
> 
> https://github.com/apache/incubator-spark
> 
> and as Spark become one of top level projects, it was moved to new repo:
> 
> https://github.com/apache/spark
> 
> The 0.9.1 version hasn't been released yet, so you should get it from the
> new git repo.
> 
> 
> 
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/java-lang-ClassNotFoundException-spark-on-mesos-tp3510p3551.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.



Re: java.lang.ClassNotFoundException - spark on mesos

2014-03-31 Thread deric
Which repository do you use?

The issue should be fixed in 0.9.1 and 1.0.0

https://spark-project.atlassian.net/browse/SPARK-1052
  

There's an old repository 

https://github.com/apache/incubator-spark

and as Spark became one of the top-level projects, it was moved to the new repo:

https://github.com/apache/spark

The 0.9.1 version hasn't been released yet, so you should get it from the
new git repo.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/java-lang-ClassNotFoundException-spark-on-mesos-tp3510p3551.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.


Re: java.lang.ClassNotFoundException - spark on mesos

2014-03-31 Thread Bharath Bhushan
Your suggestion took me past the ClassNotFoundException. I then hit an
akka.actor.ActorNotFound exception. I patched PR 568 into my 0.9.0 spark
codebase and everything worked.

So thanks a lot, Tim. Is there a JIRA/PR for the protobuf issue? Why is it not 
fixed in the latest git tree?

Thanks.

On 31-Mar-2014, at 11:30 pm, Tim St Clair  wrote:

> It sounds like the protobuf issue. 
> 
> So FWIW, You might want to try updating the 0.9.0 w/pom mods for mesos & 
> protobuf. 
> 
> mesos 0.17.0 & protobuf 2.5   
> 
> Cheers,
> Tim
> 
> - Original Message -
>> From: "Bharath Bhushan" 
>> To: user@spark.apache.org
>> Sent: Monday, March 31, 2014 9:46:32 AM
>> Subject: Re: java.lang.ClassNotFoundException - spark on mesos
>> 
>> I tried 0.9.0 and the latest git tree of spark. For mesos, I tried 0.17.0 and
>> the latest git tree.
>> 
>> Thanks
>> 
>> 
>> On 31-Mar-2014, at 7:24 pm, Tim St Clair  wrote:
>> 
>>> What versions are you running?
>>> 
>>> There is a known protobuf 2.5 mismatch, depending on your versions.
>>> 
>>> Cheers,
>>> Tim
>>> 
>>> - Original Message -
>>>> From: "Bharath Bhushan" 
>>>> To: user@spark.apache.org
>>>> Sent: Monday, March 31, 2014 8:16:19 AM
>>>> Subject: java.lang.ClassNotFoundException - spark on mesos
>>>> 
>>>> I am facing different kinds of java.lang.ClassNotFoundException when
>>>> trying
>>>> to run spark on mesos. One error has to do with
>>>> org.apache.spark.executor.MesosExecutorBackend. Another has to do with
>>>> org.apache.spark.serializer.JavaSerializer. I see other people complaining
>>>> about similar issues.
>>>> 
>>>> I tried with different version of spark distribution - 0.9.0 and
>>>> 1.0.0-SNAPSHOT and faced the same problem. I think the reason for this is
>>>> is
>>>> related to the error below.
>>>> 
>>>> $ jar -xf spark-assembly_2.10-0.9.0-incubating-hadoop2.2.0.jar
>>>> java.io.IOException: META-INF/license : could not create directory
>>>>   at sun.tools.jar.Main.extractFile(Main.java:907)
>>>>   at sun.tools.jar.Main.extract(Main.java:850)
>>>>   at sun.tools.jar.Main.run(Main.java:240)
>>>>   at sun.tools.jar.Main.main(Main.java:1147)
>>>> 
>>>> This error happens with all the jars that I created. But the classes that
>>>> are
>>>> already generated is different in the different cases. If JavaSerializer
>>>> is
>>>> not already extracted before encountering META-INF/license, then that
>>>> class
>>>> is not found during execution. If MesosExecutorBackend is not found, then
>>>> that class shows up in the mesos slave error logs. Can someone confirm if
>>>> this is a valid cause for the problem I am seeing? Any way I can debug
>>>> this
>>>> further?
>>>> 
>>>> — Bharath
>>> 
>>> --
>>> Cheers,
>>> Tim
>>> Freedom, Features, Friends, First -> Fedora
>>> https://fedoraproject.org/wiki/SIGs/bigdata
>> 
>> 
> 
> -- 
> Cheers,
> Tim
> Freedom, Features, Friends, First -> Fedora
> https://fedoraproject.org/wiki/SIGs/bigdata



Re: java.lang.ClassNotFoundException - spark on mesos

2014-03-31 Thread Tim St Clair
It sounds like the protobuf issue. 

So FWIW, You might want to try updating the 0.9.0 w/pom mods for mesos & 
protobuf. 

mesos 0.17.0 & protobuf 2.5   

Cheers,
Tim

- Original Message -
> From: "Bharath Bhushan" 
> To: user@spark.apache.org
> Sent: Monday, March 31, 2014 9:46:32 AM
> Subject: Re: java.lang.ClassNotFoundException - spark on mesos
> 
> I tried 0.9.0 and the latest git tree of spark. For mesos, I tried 0.17.0 and
> the latest git tree.
> 
> Thanks
> 
> 
> On 31-Mar-2014, at 7:24 pm, Tim St Clair  wrote:
> 
> > What versions are you running?
> > 
> > There is a known protobuf 2.5 mismatch, depending on your versions.
> > 
> > Cheers,
> > Tim
> > 
> > - Original Message -
> >> From: "Bharath Bhushan" 
> >> To: user@spark.apache.org
> >> Sent: Monday, March 31, 2014 8:16:19 AM
> >> Subject: java.lang.ClassNotFoundException - spark on mesos
> >> 
> >> I am facing different kinds of java.lang.ClassNotFoundException when
> >> trying
> >> to run spark on mesos. One error has to do with
> >> org.apache.spark.executor.MesosExecutorBackend. Another has to do with
> >> org.apache.spark.serializer.JavaSerializer. I see other people complaining
> >> about similar issues.
> >> 
> >> I tried with different version of spark distribution - 0.9.0 and
> >> 1.0.0-SNAPSHOT and faced the same problem. I think the reason for this is
> >> is
> >> related to the error below.
> >> 
> >> $ jar -xf spark-assembly_2.10-0.9.0-incubating-hadoop2.2.0.jar
> >> java.io.IOException: META-INF/license : could not create directory
> >>at sun.tools.jar.Main.extractFile(Main.java:907)
> >>at sun.tools.jar.Main.extract(Main.java:850)
> >>at sun.tools.jar.Main.run(Main.java:240)
> >>at sun.tools.jar.Main.main(Main.java:1147)
> >> 
> >> This error happens with all the jars that I created. But the classes that
> >> are
> >> already generated is different in the different cases. If JavaSerializer
> >> is
> >> not already extracted before encountering META-INF/license, then that
> >> class
> >> is not found during execution. If MesosExecutorBackend is not found, then
> >> that class shows up in the mesos slave error logs. Can someone confirm if
> >> this is a valid cause for the problem I am seeing? Any way I can debug
> >> this
> >> further?
> >> 
> >> — Bharath
> > 
> > --
> > Cheers,
> > Tim
> > Freedom, Features, Friends, First -> Fedora
> > https://fedoraproject.org/wiki/SIGs/bigdata
> 
> 

-- 
Cheers,
Tim
Freedom, Features, Friends, First -> Fedora
https://fedoraproject.org/wiki/SIGs/bigdata


Re: java.lang.ClassNotFoundException - spark on mesos

2014-03-31 Thread Bharath Bhushan
I tried 0.9.0 and the latest git tree of spark. For mesos, I tried 0.17.0 and 
the latest git tree.

Thanks


On 31-Mar-2014, at 7:24 pm, Tim St Clair  wrote:

> What versions are you running?  
> 
> There is a known protobuf 2.5 mismatch, depending on your versions. 
> 
> Cheers,
> Tim
> 
> - Original Message -
>> From: "Bharath Bhushan" 
>> To: user@spark.apache.org
>> Sent: Monday, March 31, 2014 8:16:19 AM
>> Subject: java.lang.ClassNotFoundException - spark on mesos
>> 
>> I am facing different kinds of java.lang.ClassNotFoundException when trying
>> to run spark on mesos. One error has to do with
>> org.apache.spark.executor.MesosExecutorBackend. Another has to do with
>> org.apache.spark.serializer.JavaSerializer. I see other people complaining
>> about similar issues.
>> 
>> I tried with different version of spark distribution - 0.9.0 and
>> 1.0.0-SNAPSHOT and faced the same problem. I think the reason for this is is
>> related to the error below.
>> 
>> $ jar -xf spark-assembly_2.10-0.9.0-incubating-hadoop2.2.0.jar
>> java.io.IOException: META-INF/license : could not create directory
>>at sun.tools.jar.Main.extractFile(Main.java:907)
>>at sun.tools.jar.Main.extract(Main.java:850)
>>at sun.tools.jar.Main.run(Main.java:240)
>>at sun.tools.jar.Main.main(Main.java:1147)
>> 
>> This error happens with all the jars that I created. But the classes that are
>> already generated is different in the different cases. If JavaSerializer is
>> not already extracted before encountering META-INF/license, then that class
>> is not found during execution. If MesosExecutorBackend is not found, then
>> that class shows up in the mesos slave error logs. Can someone confirm if
>> this is a valid cause for the problem I am seeing? Any way I can debug this
>> further?
>> 
>> — Bharath
> 
> -- 
> Cheers,
> Tim
> Freedom, Features, Friends, First -> Fedora
> https://fedoraproject.org/wiki/SIGs/bigdata



Re: java.lang.ClassNotFoundException - spark on mesos

2014-03-31 Thread Tim St Clair
What versions are you running?  

There is a known protobuf 2.5 mismatch, depending on your versions. 

Cheers,
Tim

- Original Message -
> From: "Bharath Bhushan" 
> To: user@spark.apache.org
> Sent: Monday, March 31, 2014 8:16:19 AM
> Subject: java.lang.ClassNotFoundException - spark on mesos
> 
> I am facing different kinds of java.lang.ClassNotFoundException when trying
> to run spark on mesos. One error has to do with
> org.apache.spark.executor.MesosExecutorBackend. Another has to do with
> org.apache.spark.serializer.JavaSerializer. I see other people complaining
> about similar issues.
> 
> I tried with different version of spark distribution - 0.9.0 and
> 1.0.0-SNAPSHOT and faced the same problem. I think the reason for this is is
> related to the error below.
> 
> $ jar -xf spark-assembly_2.10-0.9.0-incubating-hadoop2.2.0.jar
> java.io.IOException: META-INF/license : could not create directory
> at sun.tools.jar.Main.extractFile(Main.java:907)
> at sun.tools.jar.Main.extract(Main.java:850)
> at sun.tools.jar.Main.run(Main.java:240)
> at sun.tools.jar.Main.main(Main.java:1147)
> 
> This error happens with all the jars that I created. But the classes that are
> already generated is different in the different cases. If JavaSerializer is
> not already extracted before encountering META-INF/license, then that class
> is not found during execution. If MesosExecutorBackend is not found, then
> that class shows up in the mesos slave error logs. Can someone confirm if
> this is a valid cause for the problem I am seeing? Any way I can debug this
> further?
> 
> — Bharath

-- 
Cheers,
Tim
Freedom, Features, Friends, First -> Fedora
https://fedoraproject.org/wiki/SIGs/bigdata


java.lang.ClassNotFoundException - spark on mesos

2014-03-31 Thread Bharath Bhushan
I am facing different kinds of java.lang.ClassNotFoundException when trying to 
run spark on mesos. One error has to do with 
org.apache.spark.executor.MesosExecutorBackend. Another has to do with 
org.apache.spark.serializer.JavaSerializer. I see other people complaining 
about similar issues.

I tried with different versions of the spark distribution - 0.9.0 and 1.0.0-SNAPSHOT -
and faced the same problem. I think the reason for this is related to the
error below.

$ jar -xf spark-assembly_2.10-0.9.0-incubating-hadoop2.2.0.jar
java.io.IOException: META-INF/license : could not create directory
at sun.tools.jar.Main.extractFile(Main.java:907)
at sun.tools.jar.Main.extract(Main.java:850)
at sun.tools.jar.Main.run(Main.java:240)
at sun.tools.jar.Main.main(Main.java:1147)

This error happens with all the jars that I created, but the set of classes extracted
before the failure differs from case to case. If JavaSerializer has not been extracted
before META-INF/license is encountered, then that class is not found during execution.
If MesosExecutorBackend is the missing one, then that class shows up in the mesos
slave error logs. Can someone confirm whether this is a valid cause for the problem I
am seeing? Is there any way I can debug this further?

— Bharath
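
One way to dig further without extracting the assembly at all (the extraction failure
itself looks like META-INF/LICENSE colliding with a META-INF/license/ directory on a
case-insensitive filesystem) is to just list the entries and check whether the suspect
classes are present. A small sketch, with the class name and argument handling purely
illustrative:

    import java.util.jar.JarEntry;
    import java.util.jar.JarFile;

    public class CheckAssembly {
        public static void main(String[] args) throws Exception {
            // Usage: java CheckAssembly <path-to-spark-assembly-jar>
            // Listing entries sidesteps the "META-INF/license : could not create
            // directory" error because nothing is written to disk.
            try (JarFile jar = new JarFile(args[0])) {
                jar.stream()
                   .map(JarEntry::getName)
                   .filter(name -> name.endsWith("JavaSerializer.class")
                                || name.endsWith("MesosExecutorBackend.class"))
                   .forEach(System.out::println);
            }
        }
    }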

Re: java.lang.ClassNotFoundException

2014-03-26 Thread Jaonary Rabarisoa
The issue and a workaround can be found here
https://github.com/apache/spark/pull/181


On Wed, Mar 26, 2014 at 10:12 PM, Aniket Mokashi wrote:

> context.objectFile[ReIdDataSetEntry]("data") -not sure how this is
> compiled in scala. But, if it uses some sort of ObjectInputStream, you need
> to be careful - ObjectInputStream uses root classloader to load classes and
> does not work with jars that are added to TCCC. Apache commons has
> ClassLoaderObjectInputStream to workaround this.
>
>
> On Wed, Mar 26, 2014 at 1:38 PM, Jaonary Rabarisoa wrote:
>
>> it seems to be an old problem :
>>
>>
>> http://mail-archives.apache.org/mod_mbox/spark-user/201311.mbox/%3c7f6aa9e820f55d4a96946a87e086ef4a4bcdf...@eagh-erfpmbx41.erf.thomson.com%3E
>>
>> https://groups.google.com/forum/#!topic/spark-users/Q66UOeA2u-I
>>
>> Does anyone got the solution ?
>>
>>
>> On Wed, Mar 26, 2014 at 5:50 PM, Yana Kadiyska 
>> wrote:
>>
>>> I might be way off here but are you looking at the logs on the worker
>>> machines? I am running an older version (0.8) and when I look at the
>>> error log for the executor process I see the exact location where the
>>> executor process tries to load the jar from...with a line like this:
>>>
>>> 14/03/26 13:57:11 INFO executor.Executor: Adding
>>> file:/dirs/dirs/spark/work/app-20140326135710-0029/0/./spark-test.jar
>>> to class loader
>>>
>>> You said "The jar file is present in each node", do you see any
>>> information on the executor indicating that it's trying to load the
>>> jar or where it's loading it from? I can't tell for sure by looking at
>>> your logs but they seem to be logs from the master and driver, not
>>> from the executor itself?
>>>
>>> On Wed, Mar 26, 2014 at 11:46 AM, Ognen Duzlevski
>>>  wrote:
>>> > Have you looked through the logs fully? I have seen this (in my limited
>>> > experience) pop up as a result of previous exceptions/errors, also as a
>>> > result of being unable to serialize objects etc.
>>> > Ognen
>>> >
>>> >
>>> > On 3/26/14, 10:39 AM, Jaonary Rabarisoa wrote:
>>> >
>>> > I notice that I get this error when I'm trying to load an objectFile
>>> with
>>> > val viperReloaded = context.objectFile[ReIdDataSetEntry]("data")
>>> >
>>> >
>>> > On Wed, Mar 26, 2014 at 3:58 PM, Jaonary Rabarisoa 
>>> > wrote:
>>> >>
>>> >> Here the output that I get :
>>> >>
>>> >> [error] (run-main-0) org.apache.spark.SparkException: Job aborted:
>>> Task
>>> >> 1.0:1 failed 4 times (most recent failure: Exception failure in TID 6
>>> on
>>> >> host 172.166.86.36: java.lang.ClassNotFoundException:
>>> >> value.models.ReIdDataSetEntry)
>>> >> org.apache.spark.SparkException: Job aborted: Task 1.0:1 failed 4
>>> times
>>> >> (most recent failure: Exception failure in TID 6 on host
>>> 172.166.86.36:
>>> >> java.lang.ClassNotFoundException: value.models.ReIdDataSetEntry)
>>> >> at
>>> >>
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1011)
>>> >> at
>>> >>
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1009)
>>> >> at
>>> >>
>>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>> >> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>> >> at
>>> >> org.apache.spark.scheduler.DAGScheduler.org
>>> $apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1009)
>>> >> at
>>> >>
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
>>> >> at
>>> >>
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
>>> >> at scala.Option.foreach(Option.scala:236)
>>> >> at
>>> >>
>>> org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:596)
>>> >> at
>>> >>
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:1

Re: java.lang.ClassNotFoundException

2014-03-26 Thread Aniket Mokashi
context.objectFile[ReIdDataSetEntry]("data") -not sure how this is compiled
in scala. But, if it uses some sort of ObjectInputStream, you need to be
careful - ObjectInputStream uses root classloader to load classes and does
not work with jars that are added to TCCC. Apache commons has
ClassLoaderObjectInputStream to workaround this.
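
A minimal sketch of the workaround mentioned here, using Apache Commons IO's
ClassLoaderObjectInputStream so that deserialization resolves classes through an
explicit loader (the thread context class loader in this sketch) instead of the root
loader; the serialized payload is just a stand-in:

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.ObjectOutputStream;

    import org.apache.commons.io.input.ClassLoaderObjectInputStream;

    public class LoaderAwareDeserialize {
        // Deserialize with an explicit class loader so classes coming from jars
        // added at runtime can be resolved, unlike plain ObjectInputStream.
        static Object readWithLoader(byte[] bytes, ClassLoader loader) throws Exception {
            try (ClassLoaderObjectInputStream in =
                     new ClassLoaderObjectInputStream(loader, new ByteArrayInputStream(bytes))) {
                return in.readObject();
            }
        }

        public static void main(String[] args) throws Exception {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject("round-trip check");
            }
            Object back = readWithLoader(buf.toByteArray(),
                    Thread.currentThread().getContextClassLoader());
            System.out.println(back);
        }
    }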


On Wed, Mar 26, 2014 at 1:38 PM, Jaonary Rabarisoa wrote:

> it seems to be an old problem :
>
>
> http://mail-archives.apache.org/mod_mbox/spark-user/201311.mbox/%3c7f6aa9e820f55d4a96946a87e086ef4a4bcdf...@eagh-erfpmbx41.erf.thomson.com%3E
>
> https://groups.google.com/forum/#!topic/spark-users/Q66UOeA2u-I
>
> Does anyone got the solution ?
>
>
> On Wed, Mar 26, 2014 at 5:50 PM, Yana Kadiyska wrote:
>
>> I might be way off here but are you looking at the logs on the worker
>> machines? I am running an older version (0.8) and when I look at the
>> error log for the executor process I see the exact location where the
>> executor process tries to load the jar from...with a line like this:
>>
>> 14/03/26 13:57:11 INFO executor.Executor: Adding
>> file:/dirs/dirs/spark/work/app-20140326135710-0029/0/./spark-test.jar
>> to class loader
>>
>> You said "The jar file is present in each node", do you see any
>> information on the executor indicating that it's trying to load the
>> jar or where it's loading it from? I can't tell for sure by looking at
>> your logs but they seem to be logs from the master and driver, not
>> from the executor itself?
>>
>> On Wed, Mar 26, 2014 at 11:46 AM, Ognen Duzlevski
>>  wrote:
>> > Have you looked through the logs fully? I have seen this (in my limited
>> > experience) pop up as a result of previous exceptions/errors, also as a
>> > result of being unable to serialize objects etc.
>> > Ognen
>> >
>> >
>> > On 3/26/14, 10:39 AM, Jaonary Rabarisoa wrote:
>> >
>> > I notice that I get this error when I'm trying to load an objectFile
>> with
>> > val viperReloaded = context.objectFile[ReIdDataSetEntry]("data")
>> >
>> >
>> > On Wed, Mar 26, 2014 at 3:58 PM, Jaonary Rabarisoa 
>> > wrote:
>> >>
>> >> Here the output that I get :
>> >>
>> >> [error] (run-main-0) org.apache.spark.SparkException: Job aborted: Task
>> >> 1.0:1 failed 4 times (most recent failure: Exception failure in TID 6
>> on
>> >> host 172.166.86.36: java.lang.ClassNotFoundException:
>> >> value.models.ReIdDataSetEntry)
>> >> org.apache.spark.SparkException: Job aborted: Task 1.0:1 failed 4 times
>> >> (most recent failure: Exception failure in TID 6 on host 172.166.86.36
>> :
>> >> java.lang.ClassNotFoundException: value.models.ReIdDataSetEntry)
>> >> at
>> >>
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1011)
>> >> at
>> >>
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1009)
>> >> at
>> >>
>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>> >> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>> >> at
>> >> org.apache.spark.scheduler.DAGScheduler.org
>> $apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1009)
>> >> at
>> >>
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
>> >> at
>> >>
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
>> >> at scala.Option.foreach(Option.scala:236)
>> >> at
>> >>
>> org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:596)
>> >> at
>> >>
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:146)
>> >> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>> >> at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>> >> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>> >> at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>> >> at
>> >>
>> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>> >> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>> >> at
&

Re: java.lang.ClassNotFoundException

2014-03-26 Thread Jaonary Rabarisoa
It seems to be an old problem:

http://mail-archives.apache.org/mod_mbox/spark-user/201311.mbox/%3c7f6aa9e820f55d4a96946a87e086ef4a4bcdf...@eagh-erfpmbx41.erf.thomson.com%3E

https://groups.google.com/forum/#!topic/spark-users/Q66UOeA2u-I

Does anyone have the solution?


On Wed, Mar 26, 2014 at 5:50 PM, Yana Kadiyska wrote:

> I might be way off here but are you looking at the logs on the worker
> machines? I am running an older version (0.8) and when I look at the
> error log for the executor process I see the exact location where the
> executor process tries to load the jar from...with a line like this:
>
> 14/03/26 13:57:11 INFO executor.Executor: Adding
> file:/dirs/dirs/spark/work/app-20140326135710-0029/0/./spark-test.jar
> to class loader
>
> You said "The jar file is present in each node", do you see any
> information on the executor indicating that it's trying to load the
> jar or where it's loading it from? I can't tell for sure by looking at
> your logs but they seem to be logs from the master and driver, not
> from the executor itself?
>
> On Wed, Mar 26, 2014 at 11:46 AM, Ognen Duzlevski
>  wrote:
> > Have you looked through the logs fully? I have seen this (in my limited
> > experience) pop up as a result of previous exceptions/errors, also as a
> > result of being unable to serialize objects etc.
> > Ognen
> >
> >
> > On 3/26/14, 10:39 AM, Jaonary Rabarisoa wrote:
> >
> > I notice that I get this error when I'm trying to load an objectFile with
> > val viperReloaded = context.objectFile[ReIdDataSetEntry]("data")
> >
> >
> > On Wed, Mar 26, 2014 at 3:58 PM, Jaonary Rabarisoa 
> > wrote:
> >>
> >> Here the output that I get :
> >>
> >> [error] (run-main-0) org.apache.spark.SparkException: Job aborted: Task
> >> 1.0:1 failed 4 times (most recent failure: Exception failure in TID 6 on
> >> host 172.166.86.36: java.lang.ClassNotFoundException:
> >> value.models.ReIdDataSetEntry)
> >> org.apache.spark.SparkException: Job aborted: Task 1.0:1 failed 4 times
> >> (most recent failure: Exception failure in TID 6 on host 172.166.86.36:
> >> java.lang.ClassNotFoundException: value.models.ReIdDataSetEntry)
> >> at
> >>
> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1011)
> >> at
> >>
> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1009)
> >> at
> >>
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> >> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> >> at
> >> org.apache.spark.scheduler.DAGScheduler.org
> $apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1009)
> >> at
> >>
> org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
> >> at
> >>
> org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
> >> at scala.Option.foreach(Option.scala:236)
> >> at
> >>
> org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:596)
> >> at
> >>
> org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:146)
> >> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
> >> at akka.actor.ActorCell.invoke(ActorCell.scala:456)
> >> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
> >> at akka.dispatch.Mailbox.run(Mailbox.scala:219)
> >> at
> >>
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
> >> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> >> at
> >>
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> >> at
> >> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> >> at
> >>
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> >>
> >> Spark says that the jar is added :
> >>
> >> 14/03/26 15:49:18 INFO SparkContext: Added JAR
> >> target/scala-2.10/value-spark_2.10-1.0.jar
> >>
> >>
> >>
> >>
> >>
> >> On Wed, Mar 26, 2014 at 3:34 PM, Ognen Duzlevski
> >>  wrote:
> >>>
> >>> Have you looked at the individual nodes logs? Can you post a bit more
> of
> >>> the exception's output?
> >>>
> >>>
> >>> On 3/26/14, 8:42 AM, Jaonary Rabarisoa wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>> I got java.lang.ClassNotFoundException even with "addJar" called. The
> >>>> jar file is present in each node.
> >>>>
> >>>> I use the version of spark from github master.
> >>>>
> >>>> Any ideas ?
> >>>>
> >>>>
> >>>> Jaonary
> >
> >
>


Re: java.lang.ClassNotFoundException

2014-03-26 Thread Yana Kadiyska
I might be way off here but are you looking at the logs on the worker
machines? I am running an older version (0.8) and when I look at the
error log for the executor process I see the exact location where the
executor process tries to load the jar from...with a line like this:

14/03/26 13:57:11 INFO executor.Executor: Adding
file:/dirs/dirs/spark/work/app-20140326135710-0029/0/./spark-test.jar
to class loader

You said "The jar file is present in each node", do you see any
information on the executor indicating that it's trying to load the
jar or where it's loading it from? I can't tell for sure by looking at
your logs but they seem to be logs from the master and driver, not
from the executor itself?

On Wed, Mar 26, 2014 at 11:46 AM, Ognen Duzlevski
 wrote:
> Have you looked through the logs fully? I have seen this (in my limited
> experience) pop up as a result of previous exceptions/errors, also as a
> result of being unable to serialize objects etc.
> Ognen
>
>
> On 3/26/14, 10:39 AM, Jaonary Rabarisoa wrote:
>
> I notice that I get this error when I'm trying to load an objectFile with
> val viperReloaded = context.objectFile[ReIdDataSetEntry]("data")
>
>
> On Wed, Mar 26, 2014 at 3:58 PM, Jaonary Rabarisoa 
> wrote:
>>
>> Here the output that I get :
>>
>> [error] (run-main-0) org.apache.spark.SparkException: Job aborted: Task
>> 1.0:1 failed 4 times (most recent failure: Exception failure in TID 6 on
>> host 172.166.86.36: java.lang.ClassNotFoundException:
>> value.models.ReIdDataSetEntry)
>> org.apache.spark.SparkException: Job aborted: Task 1.0:1 failed 4 times
>> (most recent failure: Exception failure in TID 6 on host 172.166.86.36:
>> java.lang.ClassNotFoundException: value.models.ReIdDataSetEntry)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1011)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1009)
>> at
>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>> at
>> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1009)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
>> at scala.Option.foreach(Option.scala:236)
>> at
>> org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:596)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:146)
>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>> at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>> at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>> at
>> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>> at
>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>> at
>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>> at
>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>
>> Spark says that the jar is added :
>>
>> 14/03/26 15:49:18 INFO SparkContext: Added JAR
>> target/scala-2.10/value-spark_2.10-1.0.jar
>>
>>
>>
>>
>>
>> On Wed, Mar 26, 2014 at 3:34 PM, Ognen Duzlevski
>>  wrote:
>>>
>>> Have you looked at the individual nodes logs? Can you post a bit more of
>>> the exception's output?
>>>
>>>
>>> On 3/26/14, 8:42 AM, Jaonary Rabarisoa wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I got java.lang.ClassNotFoundException even with "addJar" called. The
>>>> jar file is present in each node.
>>>>
>>>> I use the version of spark from github master.
>>>>
>>>> Any ideas ?
>>>>
>>>>
>>>> Jaonary
>
>


Re: java.lang.ClassNotFoundException

2014-03-26 Thread Jaonary Rabarisoa
In fact, it may be related to object serialization:

14/03/26 17:02:19 INFO TaskSetManager: Serialized task 3.0:1 as 2025 bytes
in 1 ms
14/03/26 17:02:19 WARN TaskSetManager: Lost TID 6 (task 3.0:0)
14/03/26 17:02:19 INFO TaskSetManager: Loss was due to
java.lang.ClassNotFoundException: value.models.ReIdDataSetEntry [duplicate
3]
14/03/26 17:02:19 INFO TaskSetManager: Starting task 3.0:0 as TID 8 on
executor 0: 132.166.86.13 (PROCESS_LOCAL)


In this case, what should I do?


On Wed, Mar 26, 2014 at 4:46 PM, Ognen Duzlevski <
og...@plainvanillagames.com> wrote:

>  Have you looked through the logs fully? I have seen this (in my limited
> experience) pop up as a result of previous exceptions/errors, also as a
> result of being unable to serialize objects etc.
> Ognen
>
>
> On 3/26/14, 10:39 AM, Jaonary Rabarisoa wrote:
>
> I notice that I get this error when I'm trying to load an objectFile with  val
> viperReloaded = context.objectFile[ReIdDataSetEntry]("data")
>
>
> On Wed, Mar 26, 2014 at 3:58 PM, Jaonary Rabarisoa wrote:
>
>>  Here the output that I get :
>>
>>  [error] (run-main-0) org.apache.spark.SparkException: Job aborted: Task
>> 1.0:1 failed 4 times (most recent failure: Exception failure in TID 6 on
>> host 172.166.86.36: java.lang.ClassNotFoundException:
>> value.models.ReIdDataSetEntry)
>> org.apache.spark.SparkException: Job aborted: Task 1.0:1 failed 4 times
>> (most recent failure: Exception failure in TID 6 on host 172.166.86.36:
>> java.lang.ClassNotFoundException: value.models.ReIdDataSetEntry)
>>  at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1011)
>>  at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1009)
>>  at
>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>  at org.apache.spark.scheduler.DAGScheduler.org
>> $apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1009)
>>  at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
>>  at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
>>  at scala.Option.foreach(Option.scala:236)
>>  at
>> org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:596)
>>  at
>> org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:146)
>>  at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>>  at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>>  at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>>  at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>>  at
>> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>>  at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>  at
>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>  at
>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>  at
>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>
>>  Spark says that the jar is added :
>>
>>  14/03/26 15:49:18 INFO SparkContext: Added JAR
>> target/scala-2.10/value-spark_2.10-1.0.jar
>>
>>
>>
>>
>>
>> On Wed, Mar 26, 2014 at 3:34 PM, Ognen Duzlevski <
>> og...@plainvanillagames.com> wrote:
>>
>>> Have you looked at the individual nodes logs? Can you post a bit more of
>>> the exception's output?
>>>
>>>
>>> On 3/26/14, 8:42 AM, Jaonary Rabarisoa wrote:
>>>
>>>> Hi all,
>>>>
>>>> I got java.lang.ClassNotFoundException even with "addJar" called. The
>>>> jar file is present in each node.
>>>>
>>>> I use the version of spark from github master.
>>>>
>>>> Any ideas ?
>>>>
>>>>
>>>> Jaonary
>>>
>>>
>


Re: java.lang.ClassNotFoundException

2014-03-26 Thread Ognen Duzlevski
Have you looked through the logs fully? I have seen this (in my limited 
experience) pop up as a result of previous exceptions/errors, also as a 
result of being unable to serialize objects etc.

Ognen

On 3/26/14, 10:39 AM, Jaonary Rabarisoa wrote:
I notice that I get this error when I'm trying to load an objectFile 
with val viperReloaded = context.objectFile[ReIdDataSetEntry]("data")



On Wed, Mar 26, 2014 at 3:58 PM, Jaonary Rabarisoa <jaon...@gmail.com> wrote:


Here the output that I get :

[error] (run-main-0) org.apache.spark.SparkException: Job aborted:
Task 1.0:1 failed 4 times (most recent failure: Exception failure
in TID 6 on host 172.166.86.36:
java.lang.ClassNotFoundException: value.models.ReIdDataSetEntry)
org.apache.spark.SparkException: Job aborted: Task 1.0:1 failed 4
times (most recent failure: Exception failure in TID 6 on host
172.166.86.36:
java.lang.ClassNotFoundException: value.models.ReIdDataSetEntry)
at

org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1011)
at

org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1009)
at

scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1009)
at

org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
at

org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
at scala.Option.foreach(Option.scala:236)
at
org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:596)
at

org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:146)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at

akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at
scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at

scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at

scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Spark says that the jar is added :

14/03/26 15:49:18 INFO SparkContext: Added JAR
target/scala-2.10/value-spark_2.10-1.0.jar





On Wed, Mar 26, 2014 at 3:34 PM, Ognen Duzlevski
<og...@plainvanillagames.com> wrote:

Have you looked at the individual nodes logs? Can you post a
bit more of the exception's output?


On 3/26/14, 8:42 AM, Jaonary Rabarisoa wrote:

Hi all,

I got java.lang.ClassNotFoundException even with "addJar"
called. The jar file is present in each node.

I use the version of spark from github master.

Any ideas ?


Jaonary





Re: java.lang.ClassNotFoundException

2014-03-26 Thread Jaonary Rabarisoa
I notice that I get this error when I'm trying to load an objectFile with
val viperReloaded = context.objectFile[ReIdDataSetEntry]("data")


On Wed, Mar 26, 2014 at 3:58 PM, Jaonary Rabarisoa wrote:

> Here the output that I get :
>
> [error] (run-main-0) org.apache.spark.SparkException: Job aborted: Task
> 1.0:1 failed 4 times (most recent failure: Exception failure in TID 6 on
> host 172.166.86.36: java.lang.ClassNotFoundException:
> value.models.ReIdDataSetEntry)
> org.apache.spark.SparkException: Job aborted: Task 1.0:1 failed 4 times
> (most recent failure: Exception failure in TID 6 on host 172.166.86.36:
> java.lang.ClassNotFoundException: value.models.ReIdDataSetEntry)
>  at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1011)
>  at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1009)
>  at
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>  at org.apache.spark.scheduler.DAGScheduler.org
> $apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1009)
>  at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
> at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
>  at scala.Option.foreach(Option.scala:236)
> at
> org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:596)
>  at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:146)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>  at akka.actor.ActorCell.invoke(ActorCell.scala:456)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>  at akka.dispatch.Mailbox.run(Mailbox.scala:219)
> at
> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>  at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at
> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>  at
> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at
> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
> Spark says that the jar is added :
>
> 14/03/26 15:49:18 INFO SparkContext: Added JAR
> target/scala-2.10/value-spark_2.10-1.0.jar
>
>
>
>
>
> On Wed, Mar 26, 2014 at 3:34 PM, Ognen Duzlevski <
> og...@plainvanillagames.com> wrote:
>
>> Have you looked at the individual nodes' logs? Can you post a bit more of
>> the exception's output?
>>
>>
>> On 3/26/14, 8:42 AM, Jaonary Rabarisoa wrote:
>>
>>> Hi all,
>>>
>>> I got java.lang.ClassNotFoundException even with "addJar" called. The
>>> jar file is present on each node.
>>>
>>> I use the version of Spark from GitHub master.
>>>
>>> Any ideas ?
>>>
>>>
>>> Jaonary
>>>
>>
>


Re: java.lang.ClassNotFoundException

2014-03-26 Thread Jaonary Rabarisoa
Here is the output that I get:

[error] (run-main-0) org.apache.spark.SparkException: Job aborted: Task
1.0:1 failed 4 times (most recent failure: Exception failure in TID 6 on
host 172.166.86.36: java.lang.ClassNotFoundException:
value.models.ReIdDataSetEntry)
org.apache.spark.SparkException: Job aborted: Task 1.0:1 failed 4 times
(most recent failure: Exception failure in TID 6 on host 172.166.86.36:
java.lang.ClassNotFoundException: value.models.ReIdDataSetEntry)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1011)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1009)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1009)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:596)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:596)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:146)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)

Spark says that the jar is added:

14/03/26 15:49:18 INFO SparkContext: Added JAR target/scala-2.10/value-spark_2.10-1.0.jar





On Wed, Mar 26, 2014 at 3:34 PM, Ognen Duzlevski <
og...@plainvanillagames.com> wrote:

> Have you looked at the individual nodes' logs? Can you post a bit more of
> the exception's output?
>
>
> On 3/26/14, 8:42 AM, Jaonary Rabarisoa wrote:
>
>> Hi all,
>>
>> I got java.lang.ClassNotFoundException even with "addJar" called. The
>> jar file is present on each node.
>>
>> I use the version of Spark from GitHub master.
>>
>> Any ideas ?
>>
>>
>> Jaonary
>>
>


Re: java.lang.ClassNotFoundException

2014-03-26 Thread Ognen Duzlevski
Have you looked at the individual nodes' logs? Can you post a bit more of the exception's output?


On 3/26/14, 8:42 AM, Jaonary Rabarisoa wrote:

Hi all,

I got java.lang.ClassNotFoundException even with "addJar" called. The jar file is present on each node.

I use the version of Spark from GitHub master.

Any ideas ?


Jaonary


java.lang.ClassNotFoundException

2014-03-26 Thread Jaonary Rabarisoa
Hi all,

I got java.lang.ClassNotFoundException even with "addJar" called. The jar file is present on each node.

I use the version of Spark from GitHub master.

Any ideas ?


Jaonary
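
A minimal sketch of the addJar pattern being discussed (paths are illustrative; point the master at a real cluster to exercise distribution). addJar makes the jar available to the executors for tasks, but the driver JVM still needs the same classes on its own classpath, for example via the classpath the application was launched with:

import org.apache.spark.{SparkConf, SparkContext}

object AddJarExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf()
        .setAppName("addjar-example")
        .setMaster("local[2]")) // replace with the cluster master URL

    // distribute the application jar to executors for this SparkContext
    sc.addJar("target/scala-2.10/value-spark_2.10-1.0.jar")

    // tasks created after this point can resolve classes from the added jar
    val evens = sc.parallelize(1 to 100).filter(_ % 2 == 0).count()
    println(s"even numbers: $evens")

    sc.stop()
  }
}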

