Re: Spark-on-Yarn ClassNotFound Exception

2022-12-18 Thread Hariharan
Hi scrypso,

Sorry for the late reply. Yes, I did mean spark.driver.extraClassPath. I
was able to work around this issue by removing the need for an extra class,
but I'll investigate along these lines nonetheless.

Thanks again for all your help!

On Thu, Dec 15, 2022 at 9:56 PM scrypso  wrote:

> Hmm, did you mean spark.*driver*.extraClassPath? That is very odd then -
> if you check the logs directory for the driver (on the cluster) I think
> there should be a launch container log, where you can see the exact command
> used to start the JVM (at the very end), and a line starting "export
> CLASSPATH" - you can double check that your jar looks to be included
> correctly there. If it is I think you have a really "interesting" issue on
> your hands!
>
> - scrypso
>
> On Wed, Dec 14, 2022, 05:17 Hariharan  wrote:
>
>> Hi scrypso,
>>
>> Thanks for the help so far, and I think you're definitely on to something
>> here. I tried loading the class as you suggested with the code below:
>>
>> try {
>> 
>> Thread.currentThread().getContextClassLoader().loadClass(MyS3ClientFactory.class.getCanonicalName());
>> logger.info("Loaded custom class");
>> } catch (ClassNotFoundException e) {
>> logger.error("Unable to load class", e);
>> }
>> return spark.read().option("mode", 
>> "DROPMALFORMED").format("avro").load();
>>
>> I am able to load the custom class as above
>> *2022-12-14 04:12:34,158 INFO  [Driver] utils.S3Reader - Loaded custom
>> class*
>>
>> But the spark.read code below it tries to initialize the s3 client and is
>> not able to load the same class.
>>
>> I tried adding
>> *--conf spark.executor.extraClassPath=myjar*
>>
>> as well, but no luck :-(
>>
>> Thanks again!
>>
>> On Tue, Dec 13, 2022 at 10:09 PM scrypso  wrote:
>>
>>> I'm on my phone, so can't compare with the Spark source, but that looks
>>> to me like it should be well after the ctx loader has been set. You could
>>> try printing the classpath of the loader
>>> Thread.currentThread().getThreadContextClassLoader(), or try to load your
>>> class from that yourself to see if you get the same error.
>>>
>>> Can you see which thread is throwing the exception? If it is a different
>>> thread than the "main" application thread it might not have the thread ctx
>>> loader set correctly. I can't see any of your classes in the stacktrace - I
>>> assume that is because of your scrubbing, but it could also be because this
>>> is run in separate thread without ctx loader set.
>>>
>>> It also looks like Hadoop is caching the FileSystems somehow - perhaps
>>> you can create the S3A filesystem yourself and hope it picks that up? (Wild
>>> guess, no idea if that works or how hard it would be.)
>>>
>>>
>>> On Tue, Dec 13, 2022, 17:29 Hariharan  wrote:
>>>
 Thanks for the response, scrypso! I will try adding the extraClassPath
 option. Meanwhile, please find the full stack trace below (I have
 masked/removed references to proprietary code)

 java.lang.RuntimeException: java.lang.RuntimeException:
 java.lang.ClassNotFoundException: Class foo.bar.MyS3ClientFactory not found
 at
 org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2720)
 at
 org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:888)
 at
 org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:542)
 at
 org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
 at
 org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
 at
 org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
 at
 org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
 at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
 at
 org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$checkAndGlobPathIfNecessary$1(DataSource.scala:752)
 at scala.collection.immutable.List.map(List.scala:293)
 at
 org.apache.spark.sql.execution.datasources.DataSource$.checkAndGlobPathIfNecessary(DataSource.scala:750)
 at
 org.apache.spark.sql.execution.datasources.DataSource.checkAndGlobPathIfNecessary(DataSource.scala:579)
 at
 org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:408)
 at
 org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
 at
 org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
 at scala.Option.getOrElse(Option.scala:189)
 at
 org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)

 Thanks again!

 On Tue, Dec 13, 2022 at 9:52 PM scrypso  wrote:

> Two ideas you could try:
>
> You can try spark.driver.extraClassPath as well. Spark loads the

Re: Spark-on-Yarn ClassNotFound Exception

2022-12-15 Thread scrypso
Hmm, did you mean spark.*driver*.extraClassPath? That is very odd then - if
you check the logs directory for the driver (on the cluster) I think there
should be a launch container log, where you can see the exact command used
to start the JVM (at the very end), and a line starting "export CLASSPATH"
- you can double check that your jar looks to be included correctly there.
If it is I think you have a really "interesting" issue on your hands!
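
If log aggregation is enabled on the cluster, you might also be able to pull
that launch log from the command line with something like the following (the
application id is just a placeholder):

yarn logs -applicationId application_1671000000000_0001 | grep CLASSPATH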

- scrypso

On Wed, Dec 14, 2022, 05:17 Hariharan  wrote:

> Hi scrypso,
>
> Thanks for the help so far, and I think you're definitely on to something
> here. I tried loading the class as you suggested with the code below:
>
> try {
> 
> Thread.currentThread().getContextClassLoader().loadClass(MyS3ClientFactory.class.getCanonicalName());
> logger.info("Loaded custom class");
> } catch (ClassNotFoundException e) {
> logger.error("Unable to load class", e);
> }
> return spark.read().option("mode", 
> "DROPMALFORMED").format("avro").load();
>
> I am able to load the custom class as above
> *2022-12-14 04:12:34,158 INFO  [Driver] utils.S3Reader - Loaded custom
> class*
>
> But the spark.read code below it tries to initialize the s3 client and is
> not able to load the same class.
>
> I tried adding
> *--conf spark.executor.extraClassPath=myjar*
>
> as well, but no luck :-(
>
> Thanks again!
>
> On Tue, Dec 13, 2022 at 10:09 PM scrypso  wrote:
>
>> I'm on my phone, so can't compare with the Spark source, but that looks
>> to me like it should be well after the ctx loader has been set. You could
>> try printing the classpath of the loader
>> Thread.currentThread().getThreadContextClassLoader(), or try to load your
>> class from that yourself to see if you get the same error.
>>
>> Can you see which thread is throwing the exception? If it is a different
>> thread than the "main" application thread it might not have the thread ctx
>> loader set correctly. I can't see any of your classes in the stacktrace - I
>> assume that is because of your scrubbing, but it could also be because this
>> is run in separate thread without ctx loader set.
>>
>> It also looks like Hadoop is caching the FileSystems somehow - perhaps
>> you can create the S3A filesystem yourself and hope it picks that up? (Wild
>> guess, no idea if that works or how hard it would be.)
>>
>>
>> On Tue, Dec 13, 2022, 17:29 Hariharan  wrote:
>>
>>> Thanks for the response, scrypso! I will try adding the extraClassPath
>>> option. Meanwhile, please find the full stack trace below (I have
>>> masked/removed references to proprietary code)
>>>
>>> java.lang.RuntimeException: java.lang.RuntimeException:
>>> java.lang.ClassNotFoundException: Class foo.bar.MyS3ClientFactory not found
>>> at
>>> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2720)
>>> at
>>> org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:888)
>>> at
>>> org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:542)
>>> at
>>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
>>> at
>>> org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
>>> at
>>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
>>> at
>>> org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
>>> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
>>> at
>>> org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$checkAndGlobPathIfNecessary$1(DataSource.scala:752)
>>> at scala.collection.immutable.List.map(List.scala:293)
>>> at
>>> org.apache.spark.sql.execution.datasources.DataSource$.checkAndGlobPathIfNecessary(DataSource.scala:750)
>>> at
>>> org.apache.spark.sql.execution.datasources.DataSource.checkAndGlobPathIfNecessary(DataSource.scala:579)
>>> at
>>> org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:408)
>>> at
>>> org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
>>> at
>>> org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
>>> at scala.Option.getOrElse(Option.scala:189)
>>> at
>>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
>>>
>>> Thanks again!
>>>
>>> On Tue, Dec 13, 2022 at 9:52 PM scrypso  wrote:
>>>
 Two ideas you could try:

 You can try spark.driver.extraClassPath as well. Spark loads the user's
 jar in a child classloader, so Spark/Yarn/Hadoop can only see your classes
 reflectively. Hadoop's Configuration should use the thread ctx classloader,
 and Spark should set that to the loader that loads your jar. The
 extraClassPath option just adds jars directly to the Java command that
 creates the driver/executor.

 I can't immediately tell how your error might arise, 

Re: Spark-on-Yarn ClassNotFound Exception

2022-12-13 Thread Hariharan
Hi scrypso,

Thanks for the help so far, and I think you're definitely on to something
here. I tried loading the class as you suggested with the code below:

try {
    // Sanity check: can the context classloader see the custom factory class?
    Thread.currentThread().getContextClassLoader()
            .loadClass(MyS3ClientFactory.class.getCanonicalName());
    logger.info("Loaded custom class");
} catch (ClassNotFoundException e) {
    logger.error("Unable to load class", e);
}
return spark.read().option("mode", "DROPMALFORMED").format("avro").load();

I am able to load the custom class as above
*2022-12-14 04:12:34,158 INFO  [Driver] utils.S3Reader - Loaded custom
class*

But the spark.read code below it tries to initialize the s3 client and is
not able to load the same class.
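
One more thing I can check here (a rough sketch in Scala - the same calls are
available from Java - in case Hadoop's Configuration is resolving classes
through a loader other than the thread context one; spark and logger are the
same objects as in the snippet above):

val hadoopConf = spark.sparkContext.hadoopConfiguration
// Compare the loader Configuration.getClass() will use with the context
// loader that just loaded the factory class successfully.
logger.info(s"conf classloader: ${hadoopConf.getClassLoader}")
logger.info(s"ctx classloader:  ${Thread.currentThread().getContextClassLoader}")
// If they differ, point the Configuration at the loader that can see the jar
// before calling spark.read.
hadoopConf.setClassLoader(classOf[MyS3ClientFactory].getClassLoader)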

I tried adding
*--conf spark.executor.extraClassPath=myjar*

as well, but no luck :-(

Thanks again!

On Tue, Dec 13, 2022 at 10:09 PM scrypso  wrote:

> I'm on my phone, so can't compare with the Spark source, but that looks to
> me like it should be well after the ctx loader has been set. You could try
> printing the classpath of the loader
> Thread.currentThread().getThreadContextClassLoader(), or try to load your
> class from that yourself to see if you get the same error.
>
> Can you see which thread is throwing the exception? If it is a different
> thread than the "main" application thread it might not have the thread ctx
> loader set correctly. I can't see any of your classes in the stacktrace - I
> assume that is because of your scrubbing, but it could also be because this
> is run in separate thread without ctx loader set.
>
> It also looks like Hadoop is caching the FileSystems somehow - perhaps you
> can create the S3A filesystem yourself and hope it picks that up? (Wild
> guess, no idea if that works or how hard it would be.)
>
>
> On Tue, Dec 13, 2022, 17:29 Hariharan  wrote:
>
>> Thanks for the response, scrypso! I will try adding the extraClassPath
>> option. Meanwhile, please find the full stack trace below (I have
>> masked/removed references to proprietary code)
>>
>> java.lang.RuntimeException: java.lang.RuntimeException:
>> java.lang.ClassNotFoundException: Class foo.bar.MyS3ClientFactory not found
>> at
>> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2720)
>> at
>> org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:888)
>> at
>> org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:542)
>> at
>> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
>> at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
>> at
>> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
>> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
>> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
>> at
>> org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$checkAndGlobPathIfNecessary$1(DataSource.scala:752)
>> at scala.collection.immutable.List.map(List.scala:293)
>> at
>> org.apache.spark.sql.execution.datasources.DataSource$.checkAndGlobPathIfNecessary(DataSource.scala:750)
>> at
>> org.apache.spark.sql.execution.datasources.DataSource.checkAndGlobPathIfNecessary(DataSource.scala:579)
>> at
>> org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:408)
>> at
>> org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
>> at
>> org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
>> at scala.Option.getOrElse(Option.scala:189)
>> at
>> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
>>
>> Thanks again!
>>
>> On Tue, Dec 13, 2022 at 9:52 PM scrypso  wrote:
>>
>>> Two ideas you could try:
>>>
>>> You can try spark.driver.extraClassPath as well. Spark loads the user's
>>> jar in a child classloader, so Spark/Yarn/Hadoop can only see your classes
>>> reflectively. Hadoop's Configuration should use the thread ctx classloader,
>>> and Spark should set that to the loader that loads your jar. The
>>> extraClassPath option just adds jars directly to the Java command that
>>> creates the driver/executor.
>>>
>>> I can't immediately tell how your error might arise, unless there is
>>> some timing issue with the Spark and Hadoop setup. Can you share the full
>>> stacktrace of the ClassNotFound exception? That might tell us when Hadoop
>>> is looking up this class.
>>>
>>> Good luck!
>>> - scrypso
>>>
>>>
>>> On Tue, Dec 13, 2022, 17:05 Hariharan  wrote:
>>>
 Missed to mention it above, but just to add, the error is coming from
 the driver. I tried using *--driver-class-path /path/to/my/jar* as
 well, but no luck.

 Thanks!

 On Mon, Dec 12, 2022 at 4:21 PM Hariharan 
 wrote:

> Hello folks,
>
> I have a spark app with a custom implementation of

Re: Spark-on-Yarn ClassNotFound Exception

2022-12-13 Thread scrypso
I'm on my phone, so I can't compare with the Spark source, but that looks to
me like it should be well after the ctx loader has been set. You could try
printing the classpath of the loader returned by
Thread.currentThread().getContextClassLoader(), or try to load your class
from that loader yourself to see if you get the same error.

Can you see which thread is throwing the exception? If it is a different
thread than the "main" application thread, it might not have the thread ctx
loader set correctly. I can't see any of your classes in the stacktrace - I
assume that is because of your scrubbing, but it could also be because this
is run in a separate thread without the ctx loader set.

It also looks like Hadoop is caching the FileSystems somehow - perhaps you
can create the S3A filesystem yourself and hope it picks that up? (Wild
guess, no idea if that works or how hard it would be.)
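
A rough sketch of both ideas in Scala (the bucket name is a placeholder and
spark is the usual SparkSession - no idea if the cache warming actually helps,
but it is cheap to try):

import java.net.{URI, URLClassLoader}
import org.apache.hadoop.fs.FileSystem

// 1) Dump what the context classloader can actually see.
Thread.currentThread().getContextClassLoader match {
  case u: URLClassLoader => u.getURLs.foreach(url => println(s"ctx loader entry: $url"))
  case other             => println(s"context classloader is not a URLClassLoader: $other")
}

// 2) Create the s3a FileSystem eagerly from a thread that can see the user jar.
//    FileSystem.get caches instances by scheme and authority, so later lookups
//    for the same bucket should reuse this one instead of re-initialising it.
val fs = FileSystem.get(new URI("s3a://my-bucket/"), spark.sparkContext.hadoopConfiguration)
println(s"warmed filesystem: ${fs.getUri}")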


On Tue, Dec 13, 2022, 17:29 Hariharan  wrote:

> Thanks for the response, scrypso! I will try adding the extraClassPath
> option. Meanwhile, please find the full stack trace below (I have
> masked/removed references to proprietary code)
>
> java.lang.RuntimeException: java.lang.RuntimeException:
> java.lang.ClassNotFoundException: Class foo.bar.MyS3ClientFactory not found
> at
> org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2720)
> at
> org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:888)
> at
> org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:542)
> at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
> at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
> at
> org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
> at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
> at
> org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$checkAndGlobPathIfNecessary$1(DataSource.scala:752)
> at scala.collection.immutable.List.map(List.scala:293)
> at
> org.apache.spark.sql.execution.datasources.DataSource$.checkAndGlobPathIfNecessary(DataSource.scala:750)
> at
> org.apache.spark.sql.execution.datasources.DataSource.checkAndGlobPathIfNecessary(DataSource.scala:579)
> at
> org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:408)
> at
> org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
> at
> org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
> at scala.Option.getOrElse(Option.scala:189)
> at
> org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
>
> Thanks again!
>
> On Tue, Dec 13, 2022 at 9:52 PM scrypso  wrote:
>
>> Two ideas you could try:
>>
>> You can try spark.driver.extraClassPath as well. Spark loads the user's
>> jar in a child classloader, so Spark/Yarn/Hadoop can only see your classes
>> reflectively. Hadoop's Configuration should use the thread ctx classloader,
>> and Spark should set that to the loader that loads your jar. The
>> extraClassPath option just adds jars directly to the Java command that
>> creates the driver/executor.
>>
>> I can't immediately tell how your error might arise, unless there is some
>> timing issue with the Spark and Hadoop setup. Can you share the full
>> stacktrace of the ClassNotFound exception? That might tell us when Hadoop
>> is looking up this class.
>>
>> Good luck!
>> - scrypso
>>
>>
>> On Tue, Dec 13, 2022, 17:05 Hariharan  wrote:
>>
>>> Missed to mention it above, but just to add, the error is coming from
>>> the driver. I tried using *--driver-class-path /path/to/my/jar* as
>>> well, but no luck.
>>>
>>> Thanks!
>>>
>>> On Mon, Dec 12, 2022 at 4:21 PM Hariharan 
>>> wrote:
>>>
 Hello folks,

 I have a spark app with a custom implementation of
 *fs.s3a.s3.client.factory.impl* which is packaged into the same jar.
 Output of *jar tf*

 *2620 Mon Dec 12 11:23:00 IST 2022 aws/utils/MyS3ClientFactory.class*

 However when I run the my spark app with spark-submit in cluster mode,
 it fails with the following error:

 *java.lang.RuntimeException: java.lang.RuntimeException:
 java.lang.ClassNotFoundException: Class aws.utils.MyS3ClientFactory not
 found*

 I tried:
 1. passing in the jar to the *--jars* option (with the local path)
 2. Passing in the jar to *spark.yarn.jars* option with an HDFS path

 but still the same error.

 Any suggestions on what I'm missing?

 Other pertinent details:
 Spark version: 3.3.0
 Hadoop version: 3.3.4

 Command used to run the app
 */spark/bin/spark-submit --class MyMainClass --deploy-mode cluster
 --master yarn  --conf spark.executor.instances=6   

Re: Spark-on-Yarn ClassNotFound Exception

2022-12-13 Thread Hariharan
Thanks for the response, scrypso! I will try adding the extraClassPath
option. Meanwhile, please find the full stack trace below (I have
masked/removed references to proprietary code)

java.lang.RuntimeException: java.lang.RuntimeException:
java.lang.ClassNotFoundException: Class foo.bar.MyS3ClientFactory not found
at
org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2720)
at
org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:888)
at
org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:542)
at
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
at org.apache.hadoop.fs.FileSystem.access$300(FileSystem.java:174)
at
org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3574)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3521)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:540)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
at
org.apache.spark.sql.execution.datasources.DataSource$.$anonfun$checkAndGlobPathIfNecessary$1(DataSource.scala:752)
at scala.collection.immutable.List.map(List.scala:293)
at
org.apache.spark.sql.execution.datasources.DataSource$.checkAndGlobPathIfNecessary(DataSource.scala:750)
at
org.apache.spark.sql.execution.datasources.DataSource.checkAndGlobPathIfNecessary(DataSource.scala:579)
at
org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:408)
at
org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
at
org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
at scala.Option.getOrElse(Option.scala:189)
at
org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)

Thanks again!

On Tue, Dec 13, 2022 at 9:52 PM scrypso  wrote:

> Two ideas you could try:
>
> You can try spark.driver.extraClassPath as well. Spark loads the user's
> jar in a child classloader, so Spark/Yarn/Hadoop can only see your classes
> reflectively. Hadoop's Configuration should use the thread ctx classloader,
> and Spark should set that to the loader that loads your jar. The
> extraClassPath option just adds jars directly to the Java command that
> creates the driver/executor.
>
> I can't immediately tell how your error might arise, unless there is some
> timing issue with the Spark and Hadoop setup. Can you share the full
> stacktrace of the ClassNotFound exception? That might tell us when Hadoop
> is looking up this class.
>
> Good luck!
> - scrypso
>
>
> On Tue, Dec 13, 2022, 17:05 Hariharan  wrote:
>
>> Missed to mention it above, but just to add, the error is coming from the
>> driver. I tried using *--driver-class-path /path/to/my/jar* as well, but
>> no luck.
>>
>> Thanks!
>>
>> On Mon, Dec 12, 2022 at 4:21 PM Hariharan  wrote:
>>
>>> Hello folks,
>>>
>>> I have a spark app with a custom implementation of
>>> *fs.s3a.s3.client.factory.impl* which is packaged into the same jar.
>>> Output of *jar tf*
>>>
>>> *2620 Mon Dec 12 11:23:00 IST 2022 aws/utils/MyS3ClientFactory.class*
>>>
>>> However when I run the my spark app with spark-submit in cluster mode,
>>> it fails with the following error:
>>>
>>> *java.lang.RuntimeException: java.lang.RuntimeException:
>>> java.lang.ClassNotFoundException: Class aws.utils.MyS3ClientFactory not
>>> found*
>>>
>>> I tried:
>>> 1. passing in the jar to the *--jars* option (with the local path)
>>> 2. Passing in the jar to *spark.yarn.jars* option with an HDFS path
>>>
>>> but still the same error.
>>>
>>> Any suggestions on what I'm missing?
>>>
>>> Other pertinent details:
>>> Spark version: 3.3.0
>>> Hadoop version: 3.3.4
>>>
>>> Command used to run the app
>>> */spark/bin/spark-submit --class MyMainClass --deploy-mode cluster
>>> --master yarn  --conf spark.executor.instances=6   /path/to/my/jar*
>>>
>>> TIA!
>>>
>>


Re: Spark-on-Yarn ClassNotFound Exception

2022-12-13 Thread scrypso
Two ideas you could try:

You can try spark.driver.extraClassPath as well. Spark loads the user's jar
in a child classloader, so Spark/Yarn/Hadoop can only see your classes
reflectively. Hadoop's Configuration should use the thread ctx classloader,
and Spark should set that to the loader that loads your jar. The
extraClassPath option just adds jars directly to the Java command that
creates the driver/executor.
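
For example, something along these lines (paths are placeholders; since
extraClassPath is passed straight to the JVM, the path has to exist on the
cluster nodes themselves - e.g. a jar copied there beforehand, or one shipped
with --jars and referenced by its localised name):

/spark/bin/spark-submit --class MyMainClass --deploy-mode cluster --master yarn \
  --conf spark.driver.extraClassPath=/path/on/cluster/nodes/myjar.jar \
  --conf spark.executor.extraClassPath=/path/on/cluster/nodes/myjar.jar \
  /path/to/my/jar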

I can't immediately tell how your error might arise, unless there is some
timing issue with the Spark and Hadoop setup. Can you share the full
stacktrace of the ClassNotFound exception? That might tell us when Hadoop
is looking up this class.

Good luck!
- scrypso


On Tue, Dec 13, 2022, 17:05 Hariharan  wrote:

> Missed to mention it above, but just to add, the error is coming from the
> driver. I tried using *--driver-class-path /path/to/my/jar* as well, but
> no luck.
>
> Thanks!
>
> On Mon, Dec 12, 2022 at 4:21 PM Hariharan  wrote:
>
>> Hello folks,
>>
>> I have a spark app with a custom implementation of
>> *fs.s3a.s3.client.factory.impl* which is packaged into the same jar.
>> Output of *jar tf*
>>
>> *2620 Mon Dec 12 11:23:00 IST 2022 aws/utils/MyS3ClientFactory.class*
>>
>> However when I run the my spark app with spark-submit in cluster mode, it
>> fails with the following error:
>>
>> *java.lang.RuntimeException: java.lang.RuntimeException:
>> java.lang.ClassNotFoundException: Class aws.utils.MyS3ClientFactory not
>> found*
>>
>> I tried:
>> 1. passing in the jar to the *--jars* option (with the local path)
>> 2. Passing in the jar to *spark.yarn.jars* option with an HDFS path
>>
>> but still the same error.
>>
>> Any suggestions on what I'm missing?
>>
>> Other pertinent details:
>> Spark version: 3.3.0
>> Hadoop version: 3.3.4
>>
>> Command used to run the app
>> */spark/bin/spark-submit --class MyMainClass --deploy-mode cluster
>> --master yarn  --conf spark.executor.instances=6   /path/to/my/jar*
>>
>> TIA!
>>
>


Re: Spark-on-Yarn ClassNotFound Exception

2022-12-13 Thread Hariharan
Missed to mention it above, but just to add, the error is coming from the
driver. I tried using *--driver-class-path /path/to/my/jar* as well, but no
luck.

Thanks!

On Mon, Dec 12, 2022 at 4:21 PM Hariharan  wrote:

> Hello folks,
>
> I have a spark app with a custom implementation of
> *fs.s3a.s3.client.factory.impl* which is packaged into the same jar.
> Output of *jar tf*
>
> *2620 Mon Dec 12 11:23:00 IST 2022 aws/utils/MyS3ClientFactory.class*
>
> However when I run the my spark app with spark-submit in cluster mode, it
> fails with the following error:
>
> *java.lang.RuntimeException: java.lang.RuntimeException:
> java.lang.ClassNotFoundException: Class aws.utils.MyS3ClientFactory not
> found*
>
> I tried:
> 1. passing in the jar to the *--jars* option (with the local path)
> 2. Passing in the jar to *spark.yarn.jars* option with an HDFS path
>
> but still the same error.
>
> Any suggestions on what I'm missing?
>
> Other pertinent details:
> Spark version: 3.3.0
> Hadoop version: 3.3.4
>
> Command used to run the app
> */spark/bin/spark-submit --class MyMainClass --deploy-mode cluster
> --master yarn  --conf spark.executor.instances=6   /path/to/my/jar*
>
> TIA!
>
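
For completeness: the factory class is picked up through Hadoop configuration,
so one way to set it at submit time - a sketch reusing the class and jar names
above, not verified on this cluster - is Spark's spark.hadoop.* prefix, which
copies the property into the Hadoop Configuration:

/spark/bin/spark-submit --class MyMainClass --deploy-mode cluster --master yarn \
  --jars /path/to/my/jar \
  --conf spark.hadoop.fs.s3a.s3.client.factory.impl=aws.utils.MyS3ClientFactory \
  /path/to/my/jar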


Re: Spark 3.0 yarn does not support cdh5

2019-10-21 Thread melin li
Many clusters still use CDH5 and would like Spark to continue supporting it;
CDH5 is based on Hadoop 2.6.

melin li wrote on Mon, Oct 21, 2019 at 3:02 PM:

> Many clusters still use CDH5 and hope it will continue to be supported; CDH5 is based on Hadoop 2.6.
>
> dev/make-distribution.sh --tgz -Pkubernetes -Pyarn -Phive-thriftserver
> -Phive -Dhadoop.version=2.6.0-cdh5.15.0 -DskipTest
>
> ```
> [INFO] Compiling 25 Scala sources to
> /Users/libinsong/Documents/codes/tongdun/spark-3.0/resource-managers/yarn/target/scala-2.12/classes
> ...
> [ERROR] [Error]
> /Users/libinsong/Documents/codes/tongdun/spark-3.0/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:298:
> value setRolledLogsIncludePattern is not a member of
> org.apache.hadoop.yarn.api.records.LogAggregationContext
> [ERROR] [Error]
> /Users/libinsong/Documents/codes/tongdun/spark-3.0/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:300:
> value setRolledLogsExcludePattern is not a member of
> org.apache.hadoop.yarn.api.records.LogAggregationContext
> [ERROR] [Error]
> /Users/libinsong/Documents/codes/tongdun/spark-3.0/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:551:
> not found: value isLocalUri
> [ERROR] [Error]
> /Users/libinsong/Documents/codes/tongdun/spark-3.0/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:1367:
> not found: value isLocalUri
> [ERROR] four errors found
> ```
>


Re: [spark on yarn] spark on yarn without DFS

2019-05-23 Thread Achilleus 003
This is interesting. Would really appreciate it if you could share what
exactly you changed in *core-site.xml* and *yarn-site.xml*.

On Wed, May 22, 2019 at 9:14 AM Gourav Sengupta 
wrote:

> just wondering what is the advantage of doing this?
>
> Regards
> Gourav Sengupta
>
> On Wed, May 22, 2019 at 3:01 AM Huizhe Wang 
> wrote:
>
>> Hi Hari,
>> Thanks :) I tried to do it as u said. It works ;)
>>
>>
>> Hariharan 于2019年5月20日 周一下午3:54写道:
>>
>>> Hi Huizhe,
>>>
>>> You can set the "fs.defaultFS" field in core-site.xml to some path on
>>> s3. That way your spark job will use S3 for all operations that need HDFS.
>>> Intermediate data will still be stored on local disk though.
>>>
>>> Thanks,
>>> Hari
>>>
>>> On Mon, May 20, 2019 at 10:14 AM Abdeali Kothari <
>>> abdealikoth...@gmail.com> wrote:
>>>
 While spark can read from S3 directly in EMR, I believe it still needs
 the HDFS to perform shuffles and to write intermediate data into disk when
 doing jobs (I.e. when the in memory need stop spill over to disk)

 For these operations, Spark does need a distributed file system - You
 could use something like EMRFS (which is like a HDFS backed by S3) on
 Amazon.

 The issue could be something else too - so a stacktrace or error
 message could help in understanding the problem.



 On Mon, May 20, 2019, 07:20 Huizhe Wang 
 wrote:

> Hi,
>
> I wanna to use Spark on Yarn without HDFS.I store my resource in AWS
> and using s3a to get them. However, when I use stop-dfs.sh stoped Namenode
> and DataNode. I got an error when using yarn cluster mode. Could I using
> yarn without start DFS, how could I use this mode?
>
> Yours,
> Jane
>



Re: [spark on yarn] spark on yarn without DFS

2019-05-22 Thread Gourav Sengupta
just wondering what is the advantage of doing this?

Regards
Gourav Sengupta

On Wed, May 22, 2019 at 3:01 AM Huizhe Wang  wrote:

> Hi Hari,
> Thanks :) I tried to do it as u said. It works ;)
>
>
> Hariharan 于2019年5月20日 周一下午3:54写道:
>
>> Hi Huizhe,
>>
>> You can set the "fs.defaultFS" field in core-site.xml to some path on s3.
>> That way your spark job will use S3 for all operations that need HDFS.
>> Intermediate data will still be stored on local disk though.
>>
>> Thanks,
>> Hari
>>
>> On Mon, May 20, 2019 at 10:14 AM Abdeali Kothari <
>> abdealikoth...@gmail.com> wrote:
>>
>>> While spark can read from S3 directly in EMR, I believe it still needs
>>> the HDFS to perform shuffles and to write intermediate data into disk when
>>> doing jobs (I.e. when the in memory need stop spill over to disk)
>>>
>>> For these operations, Spark does need a distributed file system - You
>>> could use something like EMRFS (which is like a HDFS backed by S3) on
>>> Amazon.
>>>
>>> The issue could be something else too - so a stacktrace or error message
>>> could help in understanding the problem.
>>>
>>>
>>>
>>> On Mon, May 20, 2019, 07:20 Huizhe Wang  wrote:
>>>
 Hi,

 I wanna to use Spark on Yarn without HDFS.I store my resource in AWS
 and using s3a to get them. However, when I use stop-dfs.sh stoped Namenode
 and DataNode. I got an error when using yarn cluster mode. Could I using
 yarn without start DFS, how could I use this mode?

 Yours,
 Jane

>>>


Re: [spark on yarn] spark on yarn without DFS

2019-05-21 Thread Huizhe Wang
Hi Hari,
Thanks :) I tried to do it as u said. It works ;)


Hariharan wrote on Mon, May 20, 2019 at 3:54 PM:

> Hi Huizhe,
>
> You can set the "fs.defaultFS" field in core-site.xml to some path on s3.
> That way your spark job will use S3 for all operations that need HDFS.
> Intermediate data will still be stored on local disk though.
>
> Thanks,
> Hari
>
> On Mon, May 20, 2019 at 10:14 AM Abdeali Kothari 
> wrote:
>
>> While spark can read from S3 directly in EMR, I believe it still needs
>> the HDFS to perform shuffles and to write intermediate data into disk when
>> doing jobs (I.e. when the in memory need stop spill over to disk)
>>
>> For these operations, Spark does need a distributed file system - You
>> could use something like EMRFS (which is like a HDFS backed by S3) on
>> Amazon.
>>
>> The issue could be something else too - so a stacktrace or error message
>> could help in understanding the problem.
>>
>>
>>
>> On Mon, May 20, 2019, 07:20 Huizhe Wang  wrote:
>>
>>> Hi,
>>>
>>> I wanna to use Spark on Yarn without HDFS.I store my resource in AWS and
>>> using s3a to get them. However, when I use stop-dfs.sh stoped Namenode and
>>> DataNode. I got an error when using yarn cluster mode. Could I using yarn
>>> without start DFS, how could I use this mode?
>>>
>>> Yours,
>>> Jane
>>>
>>


Re: [spark on yarn] spark on yarn without DFS

2019-05-20 Thread JB Data31
There is a kind of check in the *yarn-site.xml*


<property>
  <name>yarn.nodemanager.remote-app-log-dir</name>
  <value>/var/yarn/logs</value>
</property>

Using *hdfs://:9000* as* fs.defaultFS* in *core-site.xml* you have to *hdfs
dfs -mkdir /var/yarn/logs*
Using *S3://* as * fs.defaultFS*...

Take care of *.dir* properties in* hdfs-site.xml*. Must point to local or
S3 value.

Curious to see *YARN* working without *DFS*.

@*JB*Δ 

On Mon, May 20, 2019 at 09:54, Hariharan  wrote:

> Hi Huizhe,
>
> You can set the "fs.defaultFS" field in core-site.xml to some path on s3.
> That way your spark job will use S3 for all operations that need HDFS.
> Intermediate data will still be stored on local disk though.
>
> Thanks,
> Hari
>
> On Mon, May 20, 2019 at 10:14 AM Abdeali Kothari 
> wrote:
>
>> While spark can read from S3 directly in EMR, I believe it still needs
>> the HDFS to perform shuffles and to write intermediate data into disk when
>> doing jobs (I.e. when the in memory need stop spill over to disk)
>>
>> For these operations, Spark does need a distributed file system - You
>> could use something like EMRFS (which is like a HDFS backed by S3) on
>> Amazon.
>>
>> The issue could be something else too - so a stacktrace or error message
>> could help in understanding the problem.
>>
>>
>>
>> On Mon, May 20, 2019, 07:20 Huizhe Wang  wrote:
>>
>>> Hi,
>>>
>>> I wanna to use Spark on Yarn without HDFS.I store my resource in AWS and
>>> using s3a to get them. However, when I use stop-dfs.sh stoped Namenode and
>>> DataNode. I got an error when using yarn cluster mode. Could I using yarn
>>> without start DFS, how could I use this mode?
>>>
>>> Yours,
>>> Jane
>>>
>>


Re: [spark on yarn] spark on yarn without DFS

2019-05-20 Thread Hariharan
Hi Huizhe,

You can set the "fs.defaultFS" field in core-site.xml to some path on s3.
That way your spark job will use S3 for all operations that need HDFS.
Intermediate data will still be stored on local disk though.
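
A minimal sketch of what that core-site.xml entry could look like (the bucket
name is a placeholder, and the s3a connector plus its credentials still need
to be configured separately):

<property>
  <name>fs.defaultFS</name>
  <value>s3a://my-bucket</value>
</property>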

Thanks,
Hari

On Mon, May 20, 2019 at 10:14 AM Abdeali Kothari 
wrote:

> While spark can read from S3 directly in EMR, I believe it still needs the
> HDFS to perform shuffles and to write intermediate data into disk when
> doing jobs (I.e. when the in memory need stop spill over to disk)
>
> For these operations, Spark does need a distributed file system - You
> could use something like EMRFS (which is like a HDFS backed by S3) on
> Amazon.
>
> The issue could be something else too - so a stacktrace or error message
> could help in understanding the problem.
>
>
>
> On Mon, May 20, 2019, 07:20 Huizhe Wang  wrote:
>
>> Hi,
>>
>> I wanna to use Spark on Yarn without HDFS.I store my resource in AWS and
>> using s3a to get them. However, when I use stop-dfs.sh stoped Namenode and
>> DataNode. I got an error when using yarn cluster mode. Could I using yarn
>> without start DFS, how could I use this mode?
>>
>> Yours,
>> Jane
>>
>


Re: [spark on yarn] spark on yarn without DFS

2019-05-19 Thread Abdeali Kothari
While Spark can read from S3 directly in EMR, I believe it still needs
HDFS to perform shuffles and to write intermediate data to disk when
running jobs (i.e. when the in-memory data needs to spill over to disk).

For these operations, Spark does need a distributed file system - you could
use something like EMRFS (which is like an HDFS backed by S3) on Amazon.

The issue could be something else too - so a stacktrace or error message
could help in understanding the problem.



On Mon, May 20, 2019, 07:20 Huizhe Wang  wrote:

> Hi,
>
> I wanna to use Spark on Yarn without HDFS.I store my resource in AWS and
> using s3a to get them. However, when I use stop-dfs.sh stoped Namenode and
> DataNode. I got an error when using yarn cluster mode. Could I using yarn
> without start DFS, how could I use this mode?
>
> Yours,
> Jane
>


Re: [spark on yarn] spark on yarn without DFS

2019-05-19 Thread Jeff Zhang
I am afraid not, because yarn needs dfs.

Huizhe Wang wrote on Mon, May 20, 2019 at 9:50 AM:

> Hi,
>
> I wanna to use Spark on Yarn without HDFS.I store my resource in AWS and
> using s3a to get them. However, when I use stop-dfs.sh stoped Namenode and
> DataNode. I got an error when using yarn cluster mode. Could I using yarn
> without start DFS, how could I use this mode?
>
> Yours,
> Jane
>


-- 
Best Regards

Jeff Zhang


Re: Spark on yarn - application hangs

2019-05-10 Thread Mich Talebzadeh
sure NP.

I meant these topics

[inline image (screenshot) not preserved in the archive]

Have a look at this article of mine

https://www.linkedin.com/pulse/real-time-processing-trade-data-kafka-flume-spark-talebzadeh-ph-d-/


under section

Understanding the Spark Application Through Visualization

See if it helps

HTH

Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Fri, 10 May 2019 at 18:10, Mkal  wrote:

> How can i check what exactly is stagnant? Do you mean on the DAG
> visualization on Spark UI?
>
> Sorry i'm new to spark.
>
>
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


Re: Spark on yarn - application hangs

2019-05-10 Thread Mkal
How can I check what exactly is stagnant? Do you mean on the DAG
visualization in the Spark UI?

Sorry, I'm new to Spark.



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Spark on yarn - application hangs

2019-05-10 Thread Mich Talebzadeh
Hi,

Have you checked the metrics in the Spark UI by any chance? What is stagnant?

HTH

Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Fri, 10 May 2019 at 17:51, Mkal  wrote:

> I've built a spark job in which an external program is called through the
> use
> of pipe().
> Job runs correctly on cluster when the input is a small sample dataset but
> when the input is a real large dataset it stays on RUNNING state forever.
>
> I've tried different ways to tune executor memory, executor cores, overhead
> memory but havent found a solution so far.
> I've also tried to force external program to use only 1 thread in case
> there
> is a problem due to it being a multithread application but nothing.
>
> Any suggestion would be welcome
>
>
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


Re: Spark on YARN, HowTo kill executor or individual task?

2019-02-12 Thread Vadim Semenov
Yeah, then the easiest would be to fork spark and run using the forked
version, and in case of YARN it should be pretty easy to do.

git clone https://github.com/apache/spark.git

cd spark

export MAVEN_OPTS="-Xmx4g -XX:ReservedCodeCacheSize=512m"

./build/mvn -DskipTests clean package

./dev/make-distribution.sh --name custom-spark --tgz -Phadoop-2.7 -Phive
-Pyarn

ls -la spark-2.4.0-SNAPSHOT-bin-custom-spark.tgz

scp spark-2.4.0-SNAPSHOT-bin-custom-spark.tgz cluster:/tmp

tar -xzf /tmp/spark-2.4.0-SNAPSHOT-bin-custom-spark.tgz -C /tmp

export SPARK_HOME="/tmp/spark-2.4.0-SNAPSHOT-bin-custom-spark"

cd $SPARK_HOME
mv conf conf.new
ln -s /etc/spark/conf conf

echo $SPARK_HOME
spark-submit --version

On Tue, Feb 12, 2019 at 6:40 AM Serega Sheypak 
wrote:
>
> I tried a similar approach, it works well for user functions. but I need
to crash tasks or executor when spark application runs "repartition". I
didn't any away to inject "poison pill" into repartition call :(
>
> пн, 11 февр. 2019 г. в 21:19, Vadim Semenov :
>>
>> something like this
>>
>> import org.apache.spark.TaskContext
>> ds.map(r => {
>>   val taskContext = TaskContext.get()
>>   if (taskContext.partitionId == 1000) {
>> throw new RuntimeException
>>   }
>>   r
>> })
>>
>> On Mon, Feb 11, 2019 at 8:41 AM Serega Sheypak 
wrote:
>> >
>> > I need to crash task which does repartition.
>> >
>> > пн, 11 февр. 2019 г. в 10:37, Gabor Somogyi :
>> >>
>> >> What blocks you to put if conditions inside the mentioned map
function?
>> >>
>> >> On Mon, Feb 11, 2019 at 10:31 AM Serega Sheypak <
serega.shey...@gmail.com> wrote:
>> >>>
>> >>> Yeah, but I don't need to crash entire app, I want to fail several
tasks or executors and then wait for completion.
>> >>>
>> >>> вс, 10 февр. 2019 г. в 21:49, Gabor Somogyi <
gabor.g.somo...@gmail.com>:
>> 
>>  Another approach is adding artificial exception into the
application's source code like this:
>> 
>>  val query = input.toDS.map(_ /
0).writeStream.format("console").start()
>> 
>>  G
>> 
>> 
>>  On Sun, Feb 10, 2019 at 9:36 PM Serega Sheypak <
serega.shey...@gmail.com> wrote:
>> >
>> > Hi BR,
>> > thanks for your reply. I want to mimic the issue and kill tasks at
a certain stage. Killing executor is also an option for me.
>> > I'm curious how do core spark contributors test spark fault
tolerance?
>> >
>> >
>> > вс, 10 февр. 2019 г. в 16:57, Gabor Somogyi <
gabor.g.somo...@gmail.com>:
>> >>
>> >> Hi Serega,
>> >>
>> >> If I understand your problem correctly you would like to kill one
executor only and the rest of the app has to be untouched.
>> >> If that's true yarn -kill is not what you want because it stops
the whole application.
>> >>
>> >> I've done similar thing when tested/testing Spark's HA features.
>> >> - jps -vlm | grep
"org.apache.spark.executor.CoarseGrainedExecutorBackend.*applicationid"
>> >> - kill -9 pidofoneexecutor
>> >>
>> >> Be aware if it's a multi-node cluster check whether at least one
process runs on a specific node(it's not required).
>> >> Happy killing...
>> >>
>> >> BR,
>> >> G
>> >>
>> >>
>> >> On Sun, Feb 10, 2019 at 4:19 PM Jörn Franke 
wrote:
>> >>>
>> >>> yarn application -kill applicationid ?
>> >>>
>> >>> > Am 10.02.2019 um 13:30 schrieb Serega Sheypak <
serega.shey...@gmail.com>:
>> >>> >
>> >>> > Hi there!
>> >>> > I have weird issue that appears only when tasks fail at
specific stage. I would like to imitate failure on my own.
>> >>> > The plan is to run problematic app and then kill entire
executor or some tasks when execution reaches certain stage.
>> >>> >
>> >>> > Is it do-able?
>> >>>
>> >>>
-
>> >>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>> >>>
>>
>>
>> --
>> Sent from my iPhone



-- 
Sent from my iPhone


Re: Spark on YARN, HowTo kill executor or individual task?

2019-02-12 Thread Serega Sheypak
I tried a similar approach; it works well for user functions, but I need to
crash tasks or an executor when the Spark application runs "repartition". I
didn't find any way to inject a "poison pill" into the repartition call :(
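
One rough workaround sketch (untested here): repartition itself takes no user
function, but the tasks of the stage that reads the shuffle can be failed by
mapping over partitions right after it - it won't fail the map-side shuffle
write, and the partition id and count below are arbitrary:

import org.apache.spark.TaskContext

ds.rdd
  .repartition(200)
  .mapPartitions { iter =>
    // Fail exactly one of the post-shuffle tasks.
    if (TaskContext.get().partitionId == 42) {
      throw new RuntimeException("poison pill after repartition")
    }
    iter
  }
  .count()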

On Mon, Feb 11, 2019 at 21:19, Vadim Semenov  wrote:

> something like this
>
> import org.apache.spark.TaskContext
> ds.map(r => {
>   val taskContext = TaskContext.get()
>   if (taskContext.partitionId == 1000) {
> throw new RuntimeException
>   }
>   r
> })
>
> On Mon, Feb 11, 2019 at 8:41 AM Serega Sheypak 
> wrote:
> >
> > I need to crash task which does repartition.
> >
> > пн, 11 февр. 2019 г. в 10:37, Gabor Somogyi :
> >>
> >> What blocks you to put if conditions inside the mentioned map function?
> >>
> >> On Mon, Feb 11, 2019 at 10:31 AM Serega Sheypak <
> serega.shey...@gmail.com> wrote:
> >>>
> >>> Yeah, but I don't need to crash entire app, I want to fail several
> tasks or executors and then wait for completion.
> >>>
> >>> вс, 10 февр. 2019 г. в 21:49, Gabor Somogyi  >:
> 
>  Another approach is adding artificial exception into the
> application's source code like this:
> 
>  val query = input.toDS.map(_ /
> 0).writeStream.format("console").start()
> 
>  G
> 
> 
>  On Sun, Feb 10, 2019 at 9:36 PM Serega Sheypak <
> serega.shey...@gmail.com> wrote:
> >
> > Hi BR,
> > thanks for your reply. I want to mimic the issue and kill tasks at a
> certain stage. Killing executor is also an option for me.
> > I'm curious how do core spark contributors test spark fault
> tolerance?
> >
> >
> > вс, 10 февр. 2019 г. в 16:57, Gabor Somogyi <
> gabor.g.somo...@gmail.com>:
> >>
> >> Hi Serega,
> >>
> >> If I understand your problem correctly you would like to kill one
> executor only and the rest of the app has to be untouched.
> >> If that's true yarn -kill is not what you want because it stops the
> whole application.
> >>
> >> I've done similar thing when tested/testing Spark's HA features.
> >> - jps -vlm | grep
> "org.apache.spark.executor.CoarseGrainedExecutorBackend.*applicationid"
> >> - kill -9 pidofoneexecutor
> >>
> >> Be aware if it's a multi-node cluster check whether at least one
> process runs on a specific node(it's not required).
> >> Happy killing...
> >>
> >> BR,
> >> G
> >>
> >>
> >> On Sun, Feb 10, 2019 at 4:19 PM Jörn Franke 
> wrote:
> >>>
> >>> yarn application -kill applicationid ?
> >>>
> >>> > Am 10.02.2019 um 13:30 schrieb Serega Sheypak <
> serega.shey...@gmail.com>:
> >>> >
> >>> > Hi there!
> >>> > I have weird issue that appears only when tasks fail at specific
> stage. I would like to imitate failure on my own.
> >>> > The plan is to run problematic app and then kill entire executor
> or some tasks when execution reaches certain stage.
> >>> >
> >>> > Is it do-able?
> >>>
> >>>
> -
> >>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
> >>>
>
>
> --
> Sent from my iPhone
>


Re: Spark on YARN, HowTo kill executor or individual task?

2019-02-11 Thread Vadim Semenov
something like this

import org.apache.spark.TaskContext
ds.map(r => {
  val taskContext = TaskContext.get()
  if (taskContext.partitionId == 1000) {
    throw new RuntimeException
  }
  r
})

On Mon, Feb 11, 2019 at 8:41 AM Serega Sheypak  wrote:
>
> I need to crash task which does repartition.
>
> пн, 11 февр. 2019 г. в 10:37, Gabor Somogyi :
>>
>> What blocks you to put if conditions inside the mentioned map function?
>>
>> On Mon, Feb 11, 2019 at 10:31 AM Serega Sheypak  
>> wrote:
>>>
>>> Yeah, but I don't need to crash entire app, I want to fail several tasks or 
>>> executors and then wait for completion.
>>>
>>> вс, 10 февр. 2019 г. в 21:49, Gabor Somogyi :

 Another approach is adding artificial exception into the application's 
 source code like this:

 val query = input.toDS.map(_ / 0).writeStream.format("console").start()

 G


 On Sun, Feb 10, 2019 at 9:36 PM Serega Sheypak  
 wrote:
>
> Hi BR,
> thanks for your reply. I want to mimic the issue and kill tasks at a 
> certain stage. Killing executor is also an option for me.
> I'm curious how do core spark contributors test spark fault tolerance?
>
>
> вс, 10 февр. 2019 г. в 16:57, Gabor Somogyi :
>>
>> Hi Serega,
>>
>> If I understand your problem correctly you would like to kill one 
>> executor only and the rest of the app has to be untouched.
>> If that's true yarn -kill is not what you want because it stops the 
>> whole application.
>>
>> I've done similar thing when tested/testing Spark's HA features.
>> - jps -vlm | grep 
>> "org.apache.spark.executor.CoarseGrainedExecutorBackend.*applicationid"
>> - kill -9 pidofoneexecutor
>>
>> Be aware if it's a multi-node cluster check whether at least one process 
>> runs on a specific node(it's not required).
>> Happy killing...
>>
>> BR,
>> G
>>
>>
>> On Sun, Feb 10, 2019 at 4:19 PM Jörn Franke  wrote:
>>>
>>> yarn application -kill applicationid ?
>>>
>>> > Am 10.02.2019 um 13:30 schrieb Serega Sheypak 
>>> > :
>>> >
>>> > Hi there!
>>> > I have weird issue that appears only when tasks fail at specific 
>>> > stage. I would like to imitate failure on my own.
>>> > The plan is to run problematic app and then kill entire executor or 
>>> > some tasks when execution reaches certain stage.
>>> >
>>> > Is it do-able?
>>>
>>> -
>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>>


-- 
Sent from my iPhone

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Spark on YARN, HowTo kill executor or individual task?

2019-02-11 Thread Serega Sheypak
I need to crash task which does repartition.

On Mon, Feb 11, 2019 at 10:37, Gabor Somogyi  wrote:

> What blocks you to put if conditions inside the mentioned map function?
>
> On Mon, Feb 11, 2019 at 10:31 AM Serega Sheypak 
> wrote:
>
>> Yeah, but I don't need to crash entire app, I want to fail several tasks
>> or executors and then wait for completion.
>>
>> вс, 10 февр. 2019 г. в 21:49, Gabor Somogyi :
>>
>>> Another approach is adding artificial exception into the application's
>>> source code like this:
>>>
>>> val query = input.toDS.map(_ / 0).writeStream.format("console").start()
>>>
>>> G
>>>
>>>
>>> On Sun, Feb 10, 2019 at 9:36 PM Serega Sheypak 
>>> wrote:
>>>
 Hi BR,
 thanks for your reply. I want to mimic the issue and kill tasks at a
 certain stage. Killing executor is also an option for me.
 I'm curious how do core spark contributors test spark fault tolerance?


 вс, 10 февр. 2019 г. в 16:57, Gabor Somogyi >>> >:

> Hi Serega,
>
> If I understand your problem correctly you would like to kill one
> executor only and the rest of the app has to be untouched.
> If that's true yarn -kill is not what you want because it stops the
> whole application.
>
> I've done similar thing when tested/testing Spark's HA features.
> - jps -vlm | grep
> "org.apache.spark.executor.CoarseGrainedExecutorBackend.*applicationid"
> - kill -9 pidofoneexecutor
>
> Be aware if it's a multi-node cluster check whether at least one
> process runs on a specific node(it's not required).
> Happy killing...
>
> BR,
> G
>
>
> On Sun, Feb 10, 2019 at 4:19 PM Jörn Franke 
> wrote:
>
>> yarn application -kill applicationid ?
>>
>> > Am 10.02.2019 um 13:30 schrieb Serega Sheypak <
>> serega.shey...@gmail.com>:
>> >
>> > Hi there!
>> > I have weird issue that appears only when tasks fail at specific
>> stage. I would like to imitate failure on my own.
>> > The plan is to run problematic app and then kill entire executor or
>> some tasks when execution reaches certain stage.
>> >
>> > Is it do-able?
>>
>> -
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>>


Re: Spark on YARN, HowTo kill executor or individual task?

2019-02-11 Thread Gabor Somogyi
What blocks you to put if conditions inside the mentioned map function?

On Mon, Feb 11, 2019 at 10:31 AM Serega Sheypak 
wrote:

> Yeah, but I don't need to crash entire app, I want to fail several tasks
> or executors and then wait for completion.
>
> вс, 10 февр. 2019 г. в 21:49, Gabor Somogyi :
>
>> Another approach is adding artificial exception into the application's
>> source code like this:
>>
>> val query = input.toDS.map(_ / 0).writeStream.format("console").start()
>>
>> G
>>
>>
>> On Sun, Feb 10, 2019 at 9:36 PM Serega Sheypak 
>> wrote:
>>
>>> Hi BR,
>>> thanks for your reply. I want to mimic the issue and kill tasks at a
>>> certain stage. Killing executor is also an option for me.
>>> I'm curious how do core spark contributors test spark fault tolerance?
>>>
>>>
>>> вс, 10 февр. 2019 г. в 16:57, Gabor Somogyi :
>>>
 Hi Serega,

 If I understand your problem correctly you would like to kill one
 executor only and the rest of the app has to be untouched.
 If that's true yarn -kill is not what you want because it stops the
 whole application.

 I've done similar thing when tested/testing Spark's HA features.
 - jps -vlm | grep
 "org.apache.spark.executor.CoarseGrainedExecutorBackend.*applicationid"
 - kill -9 pidofoneexecutor

 Be aware if it's a multi-node cluster check whether at least one
 process runs on a specific node(it's not required).
 Happy killing...

 BR,
 G


 On Sun, Feb 10, 2019 at 4:19 PM Jörn Franke 
 wrote:

> yarn application -kill applicationid ?
>
> > Am 10.02.2019 um 13:30 schrieb Serega Sheypak <
> serega.shey...@gmail.com>:
> >
> > Hi there!
> > I have weird issue that appears only when tasks fail at specific
> stage. I would like to imitate failure on my own.
> > The plan is to run problematic app and then kill entire executor or
> some tasks when execution reaches certain stage.
> >
> > Is it do-able?
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


Re: Spark on YARN, HowTo kill executor or individual task?

2019-02-11 Thread Serega Sheypak
Yeah, but I don't need to crash entire app, I want to fail several tasks or
executors and then wait for completion.

On Sun, Feb 10, 2019 at 21:49, Gabor Somogyi  wrote:

> Another approach is adding artificial exception into the application's
> source code like this:
>
> val query = input.toDS.map(_ / 0).writeStream.format("console").start()
>
> G
>
>
> On Sun, Feb 10, 2019 at 9:36 PM Serega Sheypak 
> wrote:
>
>> Hi BR,
>> thanks for your reply. I want to mimic the issue and kill tasks at a
>> certain stage. Killing executor is also an option for me.
>> I'm curious how do core spark contributors test spark fault tolerance?
>>
>>
>> вс, 10 февр. 2019 г. в 16:57, Gabor Somogyi :
>>
>>> Hi Serega,
>>>
>>> If I understand your problem correctly you would like to kill one
>>> executor only and the rest of the app has to be untouched.
>>> If that's true yarn -kill is not what you want because it stops the
>>> whole application.
>>>
>>> I've done similar thing when tested/testing Spark's HA features.
>>> - jps -vlm | grep
>>> "org.apache.spark.executor.CoarseGrainedExecutorBackend.*applicationid"
>>> - kill -9 pidofoneexecutor
>>>
>>> Be aware if it's a multi-node cluster check whether at least one process
>>> runs on a specific node(it's not required).
>>> Happy killing...
>>>
>>> BR,
>>> G
>>>
>>>
>>> On Sun, Feb 10, 2019 at 4:19 PM Jörn Franke 
>>> wrote:
>>>
 yarn application -kill applicationid ?

 > Am 10.02.2019 um 13:30 schrieb Serega Sheypak <
 serega.shey...@gmail.com>:
 >
 > Hi there!
 > I have weird issue that appears only when tasks fail at specific
 stage. I would like to imitate failure on my own.
 > The plan is to run problematic app and then kill entire executor or
 some tasks when execution reaches certain stage.
 >
 > Is it do-able?

 -
 To unsubscribe e-mail: user-unsubscr...@spark.apache.org




Re: Spark on YARN, HowTo kill executor or individual task?

2019-02-10 Thread Gabor Somogyi
Another approach is adding artificial exception into the application's
source code like this:

val query = input.toDS.map(_ / 0).writeStream.format("console").start()

G


On Sun, Feb 10, 2019 at 9:36 PM Serega Sheypak 
wrote:

> Hi BR,
> thanks for your reply. I want to mimic the issue and kill tasks at a
> certain stage. Killing executor is also an option for me.
> I'm curious how do core spark contributors test spark fault tolerance?
>
>
> вс, 10 февр. 2019 г. в 16:57, Gabor Somogyi :
>
>> Hi Serega,
>>
>> If I understand your problem correctly you would like to kill one
>> executor only and the rest of the app has to be untouched.
>> If that's true yarn -kill is not what you want because it stops the whole
>> application.
>>
>> I've done similar thing when tested/testing Spark's HA features.
>> - jps -vlm | grep
>> "org.apache.spark.executor.CoarseGrainedExecutorBackend.*applicationid"
>> - kill -9 pidofoneexecutor
>>
>> Be aware if it's a multi-node cluster check whether at least one process
>> runs on a specific node(it's not required).
>> Happy killing...
>>
>> BR,
>> G
>>
>>
>> On Sun, Feb 10, 2019 at 4:19 PM Jörn Franke  wrote:
>>
>>> yarn application -kill applicationid ?
>>>
>>> > Am 10.02.2019 um 13:30 schrieb Serega Sheypak <
>>> serega.shey...@gmail.com>:
>>> >
>>> > Hi there!
>>> > I have weird issue that appears only when tasks fail at specific
>>> stage. I would like to imitate failure on my own.
>>> > The plan is to run problematic app and then kill entire executor or
>>> some tasks when execution reaches certain stage.
>>> >
>>> > Is it do-able?
>>>
>>> -
>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>>
>>>


Re: Spark on YARN, HowTo kill executor or individual task?

2019-02-10 Thread Serega Sheypak
Hi BR,
Thanks for your reply. I want to mimic the issue and kill tasks at a
certain stage. Killing an executor is also an option for me.
I'm curious how core Spark contributors test Spark's fault tolerance.


On Sun, Feb 10, 2019 at 16:57, Gabor Somogyi wrote:

> Hi Serega,
>
> If I understand your problem correctly you would like to kill one executor
> only and the rest of the app has to be untouched.
> If that's true yarn -kill is not what you want because it stops the whole
> application.
>
> I've done similar thing when tested/testing Spark's HA features.
> - jps -vlm | grep
> "org.apache.spark.executor.CoarseGrainedExecutorBackend.*applicationid"
> - kill -9 pidofoneexecutor
>
> Be aware if it's a multi-node cluster check whether at least one process
> runs on a specific node(it's not required).
> Happy killing...
>
> BR,
> G
>
>
> On Sun, Feb 10, 2019 at 4:19 PM Jörn Franke  wrote:
>
>> yarn application -kill applicationid ?
>>
>> > On 10.02.2019 at 13:30, Serega Sheypak > > wrote:
>> >
>> > Hi there!
>> > I have weird issue that appears only when tasks fail at specific stage.
>> I would like to imitate failure on my own.
>> > The plan is to run problematic app and then kill entire executor or
>> some tasks when execution reaches certain stage.
>> >
>> > Is it do-able?
>>
>> -
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>>


Re: Spark on YARN, HowTo kill executor or individual task?

2019-02-10 Thread Gabor Somogyi
Hi Serega,

If I understand your problem correctly, you would like to kill only one
executor and leave the rest of the app untouched.
If that's true, yarn application -kill is not what you want because it stops
the whole application.

I've done a similar thing when testing Spark's HA features.
- jps -vlm | grep
"org.apache.spark.executor.CoarseGrainedExecutorBackend.*applicationid"
- kill -9 pidofoneexecutor

Be aware that on a multi-node cluster you should check whether at least one
executor process runs on the specific node (executors are not required to run
on every node).
Happy killing...

BR,
G


On Sun, Feb 10, 2019 at 4:19 PM Jörn Franke  wrote:

> yarn application -kill applicationid ?
>
> > On 10.02.2019 at 13:30, Serega Sheypak  > wrote:
> >
> > Hi there!
> > I have weird issue that appears only when tasks fail at specific stage.
> I would like to imitate failure on my own.
> > The plan is to run problematic app and then kill entire executor or some
> tasks when execution reaches certain stage.
> >
> > Is it do-able?
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


Re: Spark on YARN, HowTo kill executor or individual task?

2019-02-10 Thread Jörn Franke
yarn application -kill applicationid ?

> On 10.02.2019 at 13:30, Serega Sheypak wrote:
> 
> Hi there!
> I have weird issue that appears only when tasks fail at specific stage. I 
> would like to imitate failure on my own. 
> The plan is to run problematic app and then kill entire executor or some 
> tasks when execution reaches certain stage.
> 
> Is it do-able? 

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Spark on Yarn, is it possible to manually blacklist nodes before running spark job?

2019-01-23 Thread Serega Sheypak
Hi Imran,
Here is my use case:
There is a 1K-node cluster, and jobs suffer performance degradation because of
a single node. It's rather hard to convince Cluster Ops to decommission a
node because of "performance degradation". Imagine 10 dev teams chasing a
single ops team, either for a valid reason (the node has problems) or because
the code has a bug, the data is skewed, or there are spots on the sun. We can't
just decommission a node because a random dev complains.

Simple solution:
- Rerun the failed/delayed job and blacklist the "problematic" node in advance.
- Report the problem if the job then works without anomalies.
- Ops collect complaints about the node and start to decommission it when a
"complaints threshold" is reached. There is a rather low probability that many
loosely coupled teams with loosely coupled jobs complain about the same single
node.


Results
- Ops are not spammed with random requests from devs.
- Devs are not blocked because of a really bad node.
- It's very cheap for everyone to "blacklist" a node during job submission
without doing anything to the node.
- It's very easy to automate such behavior. Many teams use countless kinds of
workflow runners and the strategy is dead simple (depending on SLA of
course).
  - Just re-run the failed job excluding nodes with failed tasks (if the number
of nodes is reasonable; see the listener sketch below for collecting those hosts).
  - Kill a stuck job if it runs longer than XXX minutes and re-start it
excluding nodes with long-running tasks.
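
A minimal sketch of the listener mentioned above, which records the hosts of failed
tasks so a workflow runner can re-submit the job excluding them (an illustration only;
`sc` is assumed to be an existing SparkContext, and the re-submission mechanism itself
is outside Spark here):

import org.apache.spark.Success
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}
import scala.collection.mutable

// Remember the hosts on which tasks did not end successfully.
val badHosts = mutable.Set.empty[String]
sc.addSparkListener(new SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    if (taskEnd.reason != Success && taskEnd.taskInfo != null) {
      badHosts += taskEnd.taskInfo.host
    }
  }
})
// ... run the job; afterwards badHosts holds candidate nodes to exclude next time.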



On Wed, Jan 23, 2019 at 23:09, Imran Rashid wrote:

> Serga, can you explain a bit more why you want this ability?
> If the node is really bad, wouldn't you want to decomission the NM
> entirely?
> If you've got heterogenous resources, than nodelabels seem like they would
> be more appropriate -- and I don't feel great about adding workarounds for
> the node-label limitations into blacklisting.
>
> I don't want to be stuck supporting a configuration with too limited a use
> case.
>
> (may be better to move discussion to
> https://issues.apache.org/jira/browse/SPARK-26688 so its better archived,
> I'm responding here in case you aren't watching that issue)
>
> On Tue, Jan 22, 2019 at 6:09 AM Jörn Franke  wrote:
>
>> You can try with Yarn node labels:
>>
>> https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/NodeLabel.html
>>
>> Then you can whitelist nodes.
>>
>> On 19.01.2019 at 00:20, Serega Sheypak wrote:
>>
>> Hi, is there any possibility to tell Scheduler to blacklist specific
>> nodes in advance?
>>
>>


Re: Spark on Yarn, is it possible to manually blacklist nodes before running spark job?

2019-01-23 Thread Imran Rashid
Serega, can you explain a bit more why you want this ability?
If the node is really bad, wouldn't you want to decommission the NM entirely?
If you've got heterogeneous resources, then node labels seem like they would
be more appropriate -- and I don't feel great about adding workarounds for
the node-label limitations into blacklisting.

I don't want to be stuck supporting a configuration with too limited a use
case.

(It may be better to move the discussion to
https://issues.apache.org/jira/browse/SPARK-26688 so it's better archived;
I'm responding here in case you aren't watching that issue.)

On Tue, Jan 22, 2019 at 6:09 AM Jörn Franke  wrote:

> You can try with Yarn node labels:
>
> https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/NodeLabel.html
>
> Then you can whitelist nodes.
>
> On 19.01.2019 at 00:20, Serega Sheypak wrote:
>
> Hi, is there any possibility to tell Scheduler to blacklist specific nodes
> in advance?
>
>


Re: Spark on Yarn, is it possible to manually blacklist nodes before running spark job?

2019-01-22 Thread Jörn Franke
You can try with Yarn node labels:
https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/NodeLabel.html

Then you can whitelist nodes.
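
For example, once a label (the name "stable" below is made up) has been defined in
YARN and attached to the healthy nodes, Spark can be pinned to those nodes via the
node-label expression settings, e.g. in spark-defaults.conf or as --conf options:

spark.yarn.am.nodeLabelExpression        stable
spark.yarn.executor.nodeLabelExpression  stable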

> On 19.01.2019 at 00:20, Serega Sheypak wrote:
> 
> Hi, is there any possibility to tell Scheduler to blacklist specific nodes in 
> advance?


Re: Spark on Yarn, is it possible to manually blacklist nodes before running spark job?

2019-01-22 Thread Attila Zsolt Piros
The new issue is https://issues.apache.org/jira/browse/SPARK-26688.


On Tue, Jan 22, 2019 at 11:30 AM Attila Zsolt Piros 
wrote:

> Hi,
>
> >> Is it this one: https://github.com/apache/spark/pull/23223 ?
>
> No. My old development was https://github.com/apache/spark/pull/21068,
> which is closed.
>
> This would be a new improvement with a new Apache JIRA issue (
> https://issues.apache.org) and with a new Github pull request.
>
> >> Can I try to reach you through Cloudera Support portal?
>
> It is not needed. This would be an improvement into the Apache Spark which
> details can be discussed in the JIRA / Github PR.
>
> Attila
>
>
> On Mon, Jan 21, 2019 at 10:18 PM Serega Sheypak 
> wrote:
>
>> Hi Apiros, thanks for your reply.
>>
>> Is it this one: https://github.com/apache/spark/pull/23223 ?
>> Can I try to reach you through Cloudera Support portal?
>>
>> On Mon, Jan 21, 2019 at 20:06, attilapiros wrote:
>>
>>> Hello, I was working on this area last year (I have developed the
>>> YarnAllocatorBlacklistTracker) and if you haven't found any solution for
>>> your problem I can introduce a new config which would contain a sequence
>>> of
>>> always blacklisted nodes. This way blacklisting would improve a bit
>>> again :)
>>>
>>>
>>>
>>> --
>>> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>>>
>>> -
>>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>>
>>>


Re: Spark on Yarn, is it possible to manually blacklist nodes before running spark job?

2019-01-22 Thread Attila Zsolt Piros
Hi,

>> Is it this one: https://github.com/apache/spark/pull/23223 ?

No. My old development was https://github.com/apache/spark/pull/21068,
which is closed.

This would be a new improvement with a new Apache JIRA issue (
https://issues.apache.org) and with a new Github pull request.

>> Can I try to reach you through Cloudera Support portal?

It is not needed. This would be an improvement to Apache Spark whose
details can be discussed in the JIRA / GitHub PR.

Attila


On Mon, Jan 21, 2019 at 10:18 PM Serega Sheypak 
wrote:

> Hi Apiros, thanks for your reply.
>
> Is it this one: https://github.com/apache/spark/pull/23223 ?
> Can I try to reach you through Cloudera Support portal?
>
> On Mon, Jan 21, 2019 at 20:06, attilapiros wrote:
>
>> Hello, I was working on this area last year (I have developed the
>> YarnAllocatorBlacklistTracker) and if you haven't found any solution for
>> your problem I can introduce a new config which would contain a sequence
>> of
>> always blacklisted nodes. This way blacklisting would improve a bit again
>> :)
>>
>>
>>
>> --
>> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>>
>> -
>> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>>
>>


Re: Spark on Yarn, is it possible to manually blacklist nodes before running spark job?

2019-01-21 Thread Serega Sheypak
Hi Apiros, thanks for your reply.

Is it this one: https://github.com/apache/spark/pull/23223 ?
Can I try to reach you through Cloudera Support portal?

On Mon, Jan 21, 2019 at 20:06, attilapiros wrote:

> Hello, I was working on this area last year (I have developed the
> YarnAllocatorBlacklistTracker) and if you haven't found any solution for
> your problem I can introduce a new config which would contain a sequence of
> always blacklisted nodes. This way blacklisting would improve a bit again
> :)
>
>
>
> --
> Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


Re: Spark on Yarn, is it possible to manually blacklist nodes before running spark job?

2019-01-21 Thread attilapiros
Hello, I was working on this area last year (I have developed the
YarnAllocatorBlacklistTracker) and if you haven't found any solution for
your problem I can introduce a new config which would contain a sequence of
always blacklisted nodes. This way blacklisting would improve a bit again :)



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Spark on Yarn, is it possible to manually blacklist nodes before running spark job?

2019-01-20 Thread Serega Sheypak
Thanks, so I'll check YARN.
Does anyone know if Spark-on-Yarn plans to expose such functionality?

On Sat, Jan 19, 2019 at 18:04, Felix Cheung wrote:

> To clarify, yarn actually supports excluding node right when requesting
> resources. It’s spark that doesn’t provide a way to populate such a
> blacklist.
>
> If you can change yarn config, the equivalent is node label:
> https://hadoop.apache.org/docs/r2.7.4/hadoop-yarn/hadoop-yarn-site/NodeLabel.html
>
>
>
> --
> *From:* Li Gao 
> *Sent:* Saturday, January 19, 2019 8:43 AM
> *To:* Felix Cheung
> *Cc:* Serega Sheypak; user
> *Subject:* Re: Spark on Yarn, is it possible to manually blacklist nodes
> before running spark job?
>
> on yarn it is impossible afaik. on kubernetes you can use taints to keep
> certain nodes outside of spark
>
> On Fri, Jan 18, 2019 at 9:35 PM Felix Cheung 
> wrote:
>
>> Not as far as I recall...
>>
>>
>> --
>> *From:* Serega Sheypak 
>> *Sent:* Friday, January 18, 2019 3:21 PM
>> *To:* user
>> *Subject:* Spark on Yarn, is it possible to manually blacklist nodes
>> before running spark job?
>>
>> Hi, is there any possibility to tell Scheduler to blacklist specific
>> nodes in advance?
>>
>


Re: Spark on Yarn, is it possible to manually blacklist nodes before running spark job?

2019-01-19 Thread Felix Cheung
To clarify, YARN actually supports excluding nodes right when requesting 
resources. It's Spark that doesn't provide a way to populate such a blacklist.

If you can change yarn config, the equivalent is node label: 
https://hadoop.apache.org/docs/r2.7.4/hadoop-yarn/hadoop-yarn-site/NodeLabel.html




From: Li Gao 
Sent: Saturday, January 19, 2019 8:43 AM
To: Felix Cheung
Cc: Serega Sheypak; user
Subject: Re: Spark on Yarn, is it possible to manually blacklist nodes before 
running spark job?

on yarn it is impossible afaik. on kubernetes you can use taints to keep 
certain nodes outside of spark

On Fri, Jan 18, 2019 at 9:35 PM Felix Cheung 
mailto:felixcheun...@hotmail.com>> wrote:
Not as far as I recall...



From: Serega Sheypak mailto:serega.shey...@gmail.com>>
Sent: Friday, January 18, 2019 3:21 PM
To: user
Subject: Spark on Yarn, is it possible to manually blacklist nodes before 
running spark job?

Hi, is there any possibility to tell Scheduler to blacklist specific nodes in 
advance?


Re: Spark on Yarn, is it possible to manually blacklist nodes before running spark job?

2019-01-19 Thread Li Gao
On YARN it is impossible, AFAIK. On Kubernetes you can use taints to keep
certain nodes outside of Spark.

On Fri, Jan 18, 2019 at 9:35 PM Felix Cheung 
wrote:

> Not as far as I recall...
>
>
> --
> *From:* Serega Sheypak 
> *Sent:* Friday, January 18, 2019 3:21 PM
> *To:* user
> *Subject:* Spark on Yarn, is it possible to manually blacklist nodes
> before running spark job?
>
> Hi, is there any possibility to tell Scheduler to blacklist specific nodes
> in advance?
>


Re: Spark on Yarn, is it possible to manually blacklist nodes before running spark job?

2019-01-18 Thread Felix Cheung
Not as far as I recall...



From: Serega Sheypak 
Sent: Friday, January 18, 2019 3:21 PM
To: user
Subject: Spark on Yarn, is it possible to manually blacklist nodes before 
running spark job?

Hi, is there any possibility to tell Scheduler to blacklist specific nodes in 
advance?


Re: Spark on YARN not utilizing all the YARN containers available

2018-10-10 Thread Gourav Sengupta
Hi Dillon,

Yes, we can understand the number of executors that are running, but the
question is more around understanding the relation between YARN containers,
their persistence, and Spark executors.

Regards,
Gourav

On Wed, Oct 10, 2018 at 6:38 AM Dillon Dukek 
wrote:

> There is documentation here
> http://spark.apache.org/docs/latest/running-on-yarn.html about running
> spark on YARN. Like I said before you can use either the logs from the
> application or the Spark UI to understand how many executors are running at
> any given time. I don't think I can help much further without more
> information about the specific use case.
>
>
> On Tue, Oct 9, 2018 at 2:54 PM Gourav Sengupta 
> wrote:
>
>> Hi Dillon,
>>
>> I do think that there is a setting available where in once YARN sets up
>> the containers then you do not deallocate them, I had used it previously in
>> HIVE, and it just saves processing time in terms of allocating containers.
>> That said I am still trying to understand how do we determine one YARN
>> container = one executor in SPARK.
>>
>> Regards,
>> Gourav
>>
>> On Tue, Oct 9, 2018 at 9:04 PM Dillon Dukek
>>  wrote:
>>
>>> I'm still not sure exactly what you are meaning by saying that you have
>>> 6 yarn containers. Yarn should just be aware of the total available
>>> resources in  your cluster and then be able to launch containers based on
>>> the executor requirements you set when you submit your job. If you can, I
>>> think it would be helpful to send me the command you're using to launch
>>> your spark process. You should also be able to use the logs and/or the
>>> spark UI to determine how many executors are running.
>>>
>>> On Tue, Oct 9, 2018 at 12:57 PM Gourav Sengupta <
>>> gourav.sengu...@gmail.com> wrote:
>>>
 hi,

 may be I am not quite clear in my head on this one. But how do we know
 that 1 yarn container = 1 executor?

 Regards,
 Gourav Sengupta

 On Tue, Oct 9, 2018 at 8:53 PM Dillon Dukek
  wrote:

> Can you send how you are launching your streaming process? Also what
> environment is this cluster running in (EMR, GCP, self managed, etc)?
>
> On Tue, Oct 9, 2018 at 10:21 AM kant kodali 
> wrote:
>
>> Hi All,
>>
>> I am using Spark 2.3.1 and using YARN as a cluster manager.
>>
>> I currently got
>>
>> 1) 6 YARN containers(executors=6) with 4 executor cores for each
>> container.
>> 2) 6 Kafka partitions from one topic.
>> 3) You can assume every other configuration is set to whatever the
>> default values are.
>>
>> Spawned a Simple Streaming Query and I see all the tasks get
>> scheduled on one YARN container. am I missing any config?
>>
>> Thanks!
>>
>


Re: Spark on YARN not utilizing all the YARN containers available

2018-10-09 Thread Dillon Dukek
There is documentation here
http://spark.apache.org/docs/latest/running-on-yarn.html about running
spark on YARN. Like I said before you can use either the logs from the
application or the Spark UI to understand how many executors are running at
any given time. I don't think I can help much further without more
information about the specific use case.


On Tue, Oct 9, 2018 at 2:54 PM Gourav Sengupta 
wrote:

> Hi Dillon,
>
> I do think that there is a setting available where in once YARN sets up
> the containers then you do not deallocate them, I had used it previously in
> HIVE, and it just saves processing time in terms of allocating containers.
> That said I am still trying to understand how do we determine one YARN
> container = one executor in SPARK.
>
> Regards,
> Gourav
>
> On Tue, Oct 9, 2018 at 9:04 PM Dillon Dukek
>  wrote:
>
>> I'm still not sure exactly what you are meaning by saying that you have 6
>> yarn containers. Yarn should just be aware of the total available resources
>> in  your cluster and then be able to launch containers based on the
>> executor requirements you set when you submit your job. If you can, I think
>> it would be helpful to send me the command you're using to launch your
>> spark process. You should also be able to use the logs and/or the spark UI
>> to determine how many executors are running.
>>
>> On Tue, Oct 9, 2018 at 12:57 PM Gourav Sengupta <
>> gourav.sengu...@gmail.com> wrote:
>>
>>> hi,
>>>
>>> may be I am not quite clear in my head on this one. But how do we know
>>> that 1 yarn container = 1 executor?
>>>
>>> Regards,
>>> Gourav Sengupta
>>>
>>> On Tue, Oct 9, 2018 at 8:53 PM Dillon Dukek
>>>  wrote:
>>>
 Can you send how you are launching your streaming process? Also what
 environment is this cluster running in (EMR, GCP, self managed, etc)?

 On Tue, Oct 9, 2018 at 10:21 AM kant kodali  wrote:

> Hi All,
>
> I am using Spark 2.3.1 and using YARN as a cluster manager.
>
> I currently got
>
> 1) 6 YARN containers(executors=6) with 4 executor cores for each
> container.
> 2) 6 Kafka partitions from one topic.
> 3) You can assume every other configuration is set to whatever the
> default values are.
>
> Spawned a Simple Streaming Query and I see all the tasks get scheduled
> on one YARN container. am I missing any config?
>
> Thanks!
>



Re: Spark on YARN not utilizing all the YARN containers available

2018-10-09 Thread Gourav Sengupta
Hi Dillon,

I do think that there is a setting available wherein, once YARN sets up the
containers, you do not deallocate them; I had used it previously in
Hive, and it just saves processing time in terms of allocating containers.
That said, I am still trying to understand how we determine that one YARN
container = one executor in Spark.

Regards,
Gourav

On Tue, Oct 9, 2018 at 9:04 PM Dillon Dukek 
wrote:

> I'm still not sure exactly what you are meaning by saying that you have 6
> yarn containers. Yarn should just be aware of the total available resources
> in  your cluster and then be able to launch containers based on the
> executor requirements you set when you submit your job. If you can, I think
> it would be helpful to send me the command you're using to launch your
> spark process. You should also be able to use the logs and/or the spark UI
> to determine how many executors are running.
>
> On Tue, Oct 9, 2018 at 12:57 PM Gourav Sengupta 
> wrote:
>
>> hi,
>>
>> may be I am not quite clear in my head on this one. But how do we know
>> that 1 yarn container = 1 executor?
>>
>> Regards,
>> Gourav Sengupta
>>
>> On Tue, Oct 9, 2018 at 8:53 PM Dillon Dukek
>>  wrote:
>>
>>> Can you send how you are launching your streaming process? Also what
>>> environment is this cluster running in (EMR, GCP, self managed, etc)?
>>>
>>> On Tue, Oct 9, 2018 at 10:21 AM kant kodali  wrote:
>>>
 Hi All,

 I am using Spark 2.3.1 and using YARN as a cluster manager.

 I currently got

 1) 6 YARN containers(executors=6) with 4 executor cores for each
 container.
 2) 6 Kafka partitions from one topic.
 3) You can assume every other configuration is set to whatever the
 default values are.

 Spawned a Simple Streaming Query and I see all the tasks get scheduled
 on one YARN container. am I missing any config?

 Thanks!

>>>


Re: Spark on YARN not utilizing all the YARN containers available

2018-10-09 Thread Dillon Dukek
I'm still not sure exactly what you mean by saying that you have 6
YARN containers. YARN should just be aware of the total available resources
in your cluster and then be able to launch containers based on the
executor requirements you set when you submit your job. If you can, I think
it would be helpful to send me the command you're using to launch your
Spark process. You should also be able to use the logs and/or the Spark UI
to determine how many executors are running.
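
For reference, on YARN each Spark executor runs in its own YARN container (plus one
extra container for the application master), so a launch command along these lines
(sizes, class and jar names are made-up placeholders) asks YARN for 6 executor
containers, each sized executor-memory plus overhead and, where the scheduler honors
vcores, 4 vcores:

spark-submit \
  --master yarn \
  --num-executors 6 \
  --executor-cores 4 \
  --executor-memory 4g \
  --class com.example.StreamingApp app.jar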

On Tue, Oct 9, 2018 at 12:57 PM Gourav Sengupta 
wrote:

> hi,
>
> may be I am not quite clear in my head on this one. But how do we know
> that 1 yarn container = 1 executor?
>
> Regards,
> Gourav Sengupta
>
> On Tue, Oct 9, 2018 at 8:53 PM Dillon Dukek
>  wrote:
>
>> Can you send how you are launching your streaming process? Also what
>> environment is this cluster running in (EMR, GCP, self managed, etc)?
>>
>> On Tue, Oct 9, 2018 at 10:21 AM kant kodali  wrote:
>>
>>> Hi All,
>>>
>>> I am using Spark 2.3.1 and using YARN as a cluster manager.
>>>
>>> I currently got
>>>
>>> 1) 6 YARN containers(executors=6) with 4 executor cores for each
>>> container.
>>> 2) 6 Kafka partitions from one topic.
>>> 3) You can assume every other configuration is set to whatever the
>>> default values are.
>>>
>>> Spawned a Simple Streaming Query and I see all the tasks get scheduled
>>> on one YARN container. am I missing any config?
>>>
>>> Thanks!
>>>
>>


Re: Spark on YARN not utilizing all the YARN containers available

2018-10-09 Thread Gourav Sengupta
hi,

Maybe I am not quite clear in my head on this one, but how do we know that
1 YARN container = 1 executor?

Regards,
Gourav Sengupta

On Tue, Oct 9, 2018 at 8:53 PM Dillon Dukek 
wrote:

> Can you send how you are launching your streaming process? Also what
> environment is this cluster running in (EMR, GCP, self managed, etc)?
>
> On Tue, Oct 9, 2018 at 10:21 AM kant kodali  wrote:
>
>> Hi All,
>>
>> I am using Spark 2.3.1 and using YARN as a cluster manager.
>>
>> I currently got
>>
>> 1) 6 YARN containers(executors=6) with 4 executor cores for each
>> container.
>> 2) 6 Kafka partitions from one topic.
>> 3) You can assume every other configuration is set to whatever the
>> default values are.
>>
>> Spawned a Simple Streaming Query and I see all the tasks get scheduled on
>> one YARN container. am I missing any config?
>>
>> Thanks!
>>
>


Re: Spark on YARN not utilizing all the YARN containers available

2018-10-09 Thread Dillon Dukek
Can you send how you are launching your streaming process? Also what
environment is this cluster running in (EMR, GCP, self managed, etc)?

On Tue, Oct 9, 2018 at 10:21 AM kant kodali  wrote:

> Hi All,
>
> I am using Spark 2.3.1 and using YARN as a cluster manager.
>
> I currently got
>
> 1) 6 YARN containers(executors=6) with 4 executor cores for each
> container.
> 2) 6 Kafka partitions from one topic.
> 3) You can assume every other configuration is set to whatever the default
> values are.
>
> Spawned a Simple Streaming Query and I see all the tasks get scheduled on
> one YARN container. am I missing any config?
>
> Thanks!
>


Re: Spark on YARN in client-mode: do we need 1 vCore for the AM?

2018-05-24 Thread Jeff Zhang
I don't think it is possible to have less than 1 core for the AM; this is due
to YARN, not Spark.

The resources taken by the AM compared to the executors should be small and
acceptable. If you do want to save more resources, I would suggest you
use yarn-cluster mode, where the driver and the AM run in the same process.

You can either use livy or zeppelin which both support interactive work in
yarn cluster mode.

http://livy.incubator.apache.org/
https://zeppelin.apache.org/
https://medium.com/@zjffdu/zeppelin-0-8-0-new-features-ea53e8810235


Another approach to save resources is to share a SparkContext across your
applications, since your scenario is interactive work (I guess it is some
kind of notebook). Zeppelin supports sharing a SparkContext across users and
notes.



On Fri, May 18, 2018 at 6:20 PM, peay wrote:

> Hello,
>
> I run a Spark cluster on YARN, and we have a bunch of client-mode
> applications we use for interactive work. Whenever we start one of these, an
> application master container is started.
>
> My understanding is that this is mostly an empty shell, used to request
> further containers or get status from YARN. Is that correct?
>
> spark.yarn.am.cores is 1, and that AM gets one full vCore on the cluster.
> Because I am using DominantResourceCalculator to take vCores into account
> for scheduling, this results in a lot of unused CPU capacity overall
> because all those AMs each block one full vCore. With enough jobs, this
> adds up quickly.
>
> I am trying to understand if we can work around that -- ideally, by
> allocating fractional vCores (e.g., give 100 millicores to the AM), or by
> allocating no vCores at all for the AM (I am fine with a bit of
> oversubscription because of that).
>
> Any idea on how to avoid blocking so many YARN vCores just for the Spark
> AMs?
>
> Thanks!
>
>


Re: spark on yarn can't load kafka dependency jar

2016-12-15 Thread Mich Talebzadeh
Try this, it should work; and yes, they are comma separated:

spark-streaming-kafka_2.10-1.5.1.jar
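
Putting the two suggestions together, a submit command along these lines passes both
jars as one comma-separated --jars value (the paths are taken from the thread, the
main class name is a placeholder, and the second jar is assumed to be uploaded next
to the first):

spark-submit \
  --master yarn \
  --jars hdfs://zzz:8020/jars/kafka_2.10-0.8.2.2.jar,hdfs://zzz:8020/jars/spark-streaming-kafka_2.10-1.5.1.jar \
  --class your.streaming.Main \
  /opt/bigdevProject/sparkStreaming_jar4/sparkStreaming.jar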

Dr Mich Talebzadeh



LinkedIn * 
https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 15 December 2016 at 22:49, neil90  wrote:

> Don't the jars need to be comma separated when you pass them?
>
> i.e. --jars "hdfs://zzz:8020/jars/kafka_2.10-0.8.2.2.jar",
> /opt/bigdevProject/sparkStreaming_jar4/sparkStreaming.jar
>
>
>
> --
> View this message in context: http://apache-spark-user-list.
> 1001560.n3.nabble.com/spark-on-yarn-can-t-load-kafka-
> dependency-jar-tp28216p28220.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


Re: spark on yarn can't load kafka dependency jar

2016-12-15 Thread neil90
Don't the jars need to be comma separated when you pass them?

i.e. --jars "hdfs://zzz:8020/jars/kafka_2.10-0.8.2.2.jar",
/opt/bigdevProject/sparkStreaming_jar4/sparkStreaming.jar 



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/spark-on-yarn-can-t-load-kafka-dependency-jar-tp28216p28220.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Spark on yarn enviroment var

2016-10-01 Thread Vadim Semenov
The question should be addressed to the Oozie community.

As far as I remember, a Spark action doesn't support env variables.

On Fri, Sep 30, 2016 at 8:11 PM, Saurabh Malviya (samalviy) <
samal...@cisco.com> wrote:

> Hi,
>
>
>
> I am running Spark on YARN using Oozie.
>
>
>
> When submitting through the command line using spark-submit, Spark is able to read
> the env variable. But when submitting through Oozie it's not able to get the env
> variable, and I don't see the driver log.
>
>
>
> Is there any way we can specify an env variable in the Oozie Spark action?
>
>
>
> Saurabh
>


Re: Spark on yarn, only 1 or 2 vcores getting allocated to the containers getting created.

2016-08-03 Thread Mungeol Heo
Try setting yarn.scheduler.capacity.resource-calculator to the DominantResourceCalculator, then check again.

On Wed, Aug 3, 2016 at 4:53 PM, Saisai Shao  wrote:
> Use dominant resource calculator instead of default resource calculator will
> get the expected vcores as you wanted. Basically by default yarn does not
> honor cpu cores as resource, so you will always see vcore is 1 no matter
> what number of cores you set in spark.
>
> On Wed, Aug 3, 2016 at 12:11 PM, satyajit vegesna
>  wrote:
>>
>> Hi All,
>>
>> I am trying to run a spark job using yarn, and i specify --executor-cores
>> value as 20.
>> But when i go check the "nodes of the cluster" page in
>> http://hostname:8088/cluster/nodes then i see 4 containers getting created
>> on each of the node in cluster.
>>
>> But can only see 1 vcore getting assigned for each containier, even when i
>> specify --executor-cores 20 while submitting job using spark-submit.
>>
>> yarn-site.xml
>> 
>> yarn.scheduler.maximum-allocation-mb
>> 6
>> 
>> 
>> yarn.scheduler.minimum-allocation-vcores
>> 1
>> 
>> 
>> yarn.scheduler.maximum-allocation-vcores
>> 40
>> 
>> 
>> yarn.nodemanager.resource.memory-mb
>> 7
>> 
>> 
>> yarn.nodemanager.resource.cpu-vcores
>> 20
>> 
>>
>>
>> Did anyone face the same issue??
>>
>> Regards,
>> Satyajit.
>
>

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Spark on yarn, only 1 or 2 vcores getting allocated to the containers getting created.

2016-08-03 Thread Mungeol Heo
Try to turn "yarn.scheduler.capacity.resource-calculator" on

On Wed, Aug 3, 2016 at 4:53 PM, Saisai Shao  wrote:
> Use dominant resource calculator instead of default resource calculator will
> get the expected vcores as you wanted. Basically by default yarn does not
> honor cpu cores as resource, so you will always see vcore is 1 no matter
> what number of cores you set in spark.
>
> On Wed, Aug 3, 2016 at 12:11 PM, satyajit vegesna
>  wrote:
>>
>> Hi All,
>>
>> I am trying to run a spark job using yarn, and i specify --executor-cores
>> value as 20.
>> But when i go check the "nodes of the cluster" page in
>> http://hostname:8088/cluster/nodes then i see 4 containers getting created
>> on each of the node in cluster.
>>
>> But can only see 1 vcore getting assigned for each containier, even when i
>> specify --executor-cores 20 while submitting job using spark-submit.
>>
>> yarn-site.xml
>> 
>> yarn.scheduler.maximum-allocation-mb
>> 6
>> 
>> 
>> yarn.scheduler.minimum-allocation-vcores
>> 1
>> 
>> 
>> yarn.scheduler.maximum-allocation-vcores
>> 40
>> 
>> 
>> yarn.nodemanager.resource.memory-mb
>> 7
>> 
>> 
>> yarn.nodemanager.resource.cpu-vcores
>> 20
>> 
>>
>>
>> Did anyone face the same issue??
>>
>> Regards,
>> Satyajit.
>
>

-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org



Re: Spark on yarn, only 1 or 2 vcores getting allocated to the containers getting created.

2016-08-03 Thread Saisai Shao
Using the dominant resource calculator instead of the default resource calculator
will get the expected vcores you wanted. Basically, by default YARN does
not honor CPU cores as a resource, so you will always see vcores = 1 no
matter what number of cores you set in Spark.
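
Concretely, for the CapacityScheduler that means setting the resource calculator in
capacity-scheduler.xml and restarting the ResourceManager:

<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>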

On Wed, Aug 3, 2016 at 12:11 PM, satyajit vegesna <
satyajit.apas...@gmail.com> wrote:

> Hi All,
>
> I am trying to run a Spark job using YARN, and I specify the --executor-cores
> value as 20.
> But when I go check the "nodes of the cluster" page at
> http://hostname:8088/cluster/nodes, I see 4 containers getting
> created on each of the nodes in the cluster.
>
> But I can only see 1 vcore getting assigned to each container, even when I
> specify --executor-cores 20 while submitting the job using spark-submit.
>
> yarn-site.xml
> 
> yarn.scheduler.maximum-allocation-mb
> 6
> 
> 
> yarn.scheduler.minimum-allocation-vcores
> 1
> 
> 
> yarn.scheduler.maximum-allocation-vcores
> 40
> 
> 
> yarn.nodemanager.resource.memory-mb
> 7
> 
> 
> yarn.nodemanager.resource.cpu-vcores
> 20
> 
>
>
> Did anyone face the same issue??
>
> Regards,
> Satyajit.
>


Re: spark on yarn

2016-05-26 Thread Steve Loughran

> On 21 May 2016, at 15:14, Shushant Arora  wrote:
> 
> And will it allocate rest executors when other containers get freed which 
> were occupied by other hadoop jobs/spark applications?
> 

requests will go into the queue(s), they'll stay outstanding until things free 
up *or more machines join the cluster*. Whoever is in the higher priority queue 
gets that free capacity

You can also play with pre-emption, in which low-priority work can get killed 
without warning.

> And is there any minimum (% of executors demanded vs available) executors it 
> wait for to be freed or just start with even 1 .
> 


that's called "gang scheduling", and no, it's not in YARN. Tricky one as it can 
complicate allocation and can result in either things never getting scheduled 
or >1 app having incompletely allocated containers and, while the capacity is 
enough for one app, if the resources are assigned over both, neither can start.

look at YARN-896 to see the big todo list for services

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: spark on yarn

2016-05-21 Thread Shushant Arora
3. And does the same behavior apply to streaming applications also?

On Sat, May 21, 2016 at 7:44 PM, Shushant Arora 
wrote:

> And will it allocate rest executors when other containers get freed which
> were occupied by other hadoop jobs/spark applications?
>
> And is there any minimum (% of executors demanded vs available) executors
> it wait for to be freed or just start with even 1 .
>
> Thanks!
>
> On Thu, Apr 21, 2016 at 8:39 PM, Steve Loughran 
> wrote:
>
>> If there isn't enough space in your cluster for all the executors you
>> asked for to be created, Spark will only get the ones which can be
>> allocated. It will start work without waiting for the others to arrive.
>>
>> Make sure you ask for enough memory: YARN is a lot more unforgiving about
>> memory use than it is about CPU
>>
>> > On 20 Apr 2016, at 16:21, Shushant Arora 
>> wrote:
>> >
>> > I am running a spark application on yarn cluster.
>> >
>> > say I have available vcors in cluster as 100.And I start spark
>> application with --num-executors 200 --num-cores 2 (so I need total
>> 200*2=400 vcores) but in my cluster only 100 are available.
>> >
>> > What will happen ? Will the job abort or it will be submitted
>> successfully and 100 vcores will be aallocated to 50 executors and rest
>> executors will be started as soon as vcores are available ?
>> >
>> > Please note dynamic allocation is not enabled in cluster. I have old
>> version 1.2.
>> >
>> > Thanks
>> >
>>
>>
>


Re: spark on yarn

2016-05-21 Thread Shushant Arora
And will it allocate the rest of the executors when other containers get freed
which were occupied by other Hadoop jobs/Spark applications?

And is there any minimum (% of executors demanded vs available) number of executors
it waits for to be freed, or does it just start with even 1?

Thanks!

On Thu, Apr 21, 2016 at 8:39 PM, Steve Loughran 
wrote:

> If there isn't enough space in your cluster for all the executors you
> asked for to be created, Spark will only get the ones which can be
> allocated. It will start work without waiting for the others to arrive.
>
> Make sure you ask for enough memory: YARN is a lot more unforgiving about
> memory use than it is about CPU
>
> > On 20 Apr 2016, at 16:21, Shushant Arora 
> wrote:
> >
> > I am running a spark application on yarn cluster.
> >
> > say I have available vcors in cluster as 100.And I start spark
> application with --num-executors 200 --num-cores 2 (so I need total
> 200*2=400 vcores) but in my cluster only 100 are available.
> >
> > What will happen ? Will the job abort or it will be submitted
> successfully and 100 vcores will be aallocated to 50 executors and rest
> executors will be started as soon as vcores are available ?
> >
> > Please note dynamic allocation is not enabled in cluster. I have old
> version 1.2.
> >
> > Thanks
> >
>
>


Re: spark on yarn

2016-04-21 Thread Steve Loughran
If there isn't enough space in your cluster for all the executors you asked for 
to be created, Spark will only get the ones which can be allocated. It will 
start work without waiting for the others to arrive.

Make sure you ask for enough memory: YARN is a lot more unforgiving about 
memory use than it is about CPU

> On 20 Apr 2016, at 16:21, Shushant Arora  wrote:
> 
> I am running a Spark application on a YARN cluster.
> 
> Say I have 100 vcores available in the cluster, and I start the Spark application 
> with --num-executors 200 --num-cores 2 (so I need a total of 200*2=400 vcores), but 
> in my cluster only 100 are available.
> 
> What will happen? Will the job abort, or will it be submitted successfully 
> with 100 vcores allocated to 50 executors, and the rest of the executors 
> started as soon as vcores become available?
> 
> Please note dynamic allocation is not enabled in cluster. I have old version 
> 1.2.
> 
> Thanks
> 


-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: spark on yarn

2016-04-20 Thread Mail.com
I get an error with a message that states what the maximum number of cores allowed is.


> On Apr 20, 2016, at 11:21 AM, Shushant Arora  
> wrote:
> 
> I am running a spark application on yarn cluster.
> 
> say I have available vcors in cluster as 100.And I start spark application 
> with --num-executors 200 --num-cores 2 (so I need total 200*2=400 vcores) but 
> in my cluster only 100 are available.
> 
> What will happen ? Will the job abort or it will be submitted successfully 
> and 100 vcores will be aallocated to 50 executors and rest executors will be 
> started as soon as vcores are available ?
> 
> Please note dynamic allocation is not enabled in cluster. I have old version 
> 1.2.
> 
> Thanks
> 

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Spark with Yarn Client

2016-03-11 Thread Alexander Pivovarov
Check doc - http://spark.apache.org/docs/latest/running-on-yarn.html

Also, you can start an EMR-4.2.0 or 4.3.0 cluster with a Spark app and see how
it's configured.

On Fri, Mar 11, 2016 at 7:50 PM, Divya Gehlot 
wrote:

> Hi,
> I am trying to understand behaviour /configuration of spark with yarn
> client on hadoop cluster .
> Can somebody help me or point me document /blog/books which has deeper
> understanding of above two.
> Thanks,
> Divya
>


Re: Spark on YARN memory consumption

2016-03-11 Thread Jan Štěrba
Thanks that explains a lot.
--
Jan Sterba
https://twitter.com/honzasterba | http://flickr.com/honzasterba |
http://500px.com/honzasterba


On Fri, Mar 11, 2016 at 2:36 PM, Silvio Fiorito
 wrote:
> Hi Jan,
>
>
>
> Yes what you’re seeing is due to YARN container memory overhead. Also,
> typically the memory increments for YARN containers is 1GB.
>
>
>
> This gives a good overview:
> http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
>
>
>
> Thanks,
>
> Silvio
>
>
>
>
>
>
>
> From: Jan Štěrba
> Sent: Friday, March 11, 2016 8:27 AM
> To: User
> Subject: Spark on YARN memory consumption
>
>
>
> Hello,
>
> I am exprimenting with tuning an on demand spark-cluster on top of our
> cloudera hadoop. I am running Cloudera 5.5.2 with Spark 1.5 right now
> and I am running spark in yarn-client mode.
>
> Right now my main experimentation is about spark.executor.memory
> property and I have noticed a strange behaviour.
>
> When I set spark.executor.memory=512M several things happen:
> - per each executor a container with 1GB memory is requested and
> assigned from YARN
> - in Spark UI I can see that each executor has 256M memory
>
> So what I am seeing is that spark requests 2x the memory but the
> executor has only 1/4 of what has been requested. Why is that?
>
> Thanks.
>
> --
> Jan Sterba
> https://twitter.com/honzasterba | http://flickr.com/honzasterba |
> http://500px.com/honzasterba
>
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



RE: Spark on YARN memory consumption

2016-03-11 Thread Silvio Fiorito
Hi Jan,



Yes what you’re seeing is due to YARN container memory overhead. Also, 
typically the memory increments for YARN containers is 1GB.



This gives a good overview: 
http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
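
As a rough sketch of the arithmetic for the 512 MB case in this thread (assuming the
default overhead and a 1 GB minimum allocation; exact values depend on the cluster
configuration):

// spark.yarn.executor.memoryOverhead defaults to max(384 MB, 10% of executor memory)
val executorMemoryMb = 512
val overheadMb = math.max(384, executorMemoryMb / 10)   // 384
val requestedMb = executorMemoryMb + overheadMb         // 896
// YARN normalizes the request up to a multiple of yarn.scheduler.minimum-allocation-mb
// (commonly 1024 MB), hence the 1 GB container. The ~256 MB shown in the Spark UI is
// only the storage fraction of the 512 MB executor heap, not the container size.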



Thanks,

Silvio







From: Jan Štěrba
Sent: Friday, March 11, 2016 8:27 AM
To: User
Subject: Spark on YARN memory consumption



Hello,

I am experimenting with tuning an on-demand Spark cluster on top of our
Cloudera Hadoop. I am running Cloudera 5.5.2 with Spark 1.5 right now
and I am running Spark in yarn-client mode.

Right now my main experimentation is about spark.executor.memory
property and I have noticed a strange behaviour.

When I set spark.executor.memory=512M several things happen:
- per each executor a container with 1GB memory is requested and
assigned from YARN
- in Spark UI I can see that each executor has 256M memory

So what I am seeing is that spark requests 2x the memory but the
executor has only 1/4 of what has been requested. Why is that?

Thanks.

--
Jan Sterba
https://twitter.com/honzasterba | http://flickr.com/honzasterba |
http://500px.com/honzasterba

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Spark on Yarn with Dynamic Resource Allocation. Container always marked as failed

2016-03-02 Thread Xiaoye Sun
Hi Jeff and Prabhu,

Thanks for your help.

I looked deep into the NodeManager log and found that I have an error message
like this:
2016-03-02 03:13:59,692 ERROR
org.apache.spark.network.shuffle.ExternalShuffleBlockResolver: error
opening leveldb file
file:/data/yarn/cache/yarn/nm-local-dir/registeredExecutors.ldb
.
Creating new file, will not be able to recover state for existing
applications

This error message is also reported in the following jira ticket.
https://issues.apache.org/jira/browse/SPARK-13622

The reason for this problem is that in core-site.xml, I set hadoop.tmp.dir as
follows:

<property>
  <name>hadoop.tmp.dir</name>
  <value>file:/home/xs6/hadoop-2.7.1/tmp</value>
</property>


I solved the problem by removing "file:" from the value field.
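
i.e. the working value looks like this (same path, just without the file: scheme):

<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/xs6/hadoop-2.7.1/tmp</value>
</property>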

Thanks!

Xiaoye


On Wed, Mar 2, 2016 at 10:02 PM, Prabhu Joseph 
wrote:

> Is all NodeManager services restarted after the change in yarn-site.xml
>
> On Thu, Mar 3, 2016 at 6:00 AM, Jeff Zhang  wrote:
>
>> The executor may fail to start. You need to check the executor logs, if
>> there's no executor log then you need to check node manager log.
>>
>> On Wed, Mar 2, 2016 at 4:26 PM, Xiaoye Sun  wrote:
>>
>>> Hi all,
>>>
>>> I am very new to spark and yarn.
>>>
>>> I am running a BroadcastTest example application using spark 1.6.0 and
>>> Hadoop/Yarn 2.7.1. in a 5 nodes cluster.
>>>
>>> I configured my configuration files according to
>>> https://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation
>>>
>>> 1. copy
>>> ./spark-1.6.0/network/yarn/target/scala-2.10/spark-1.6.0-yarn-shuffle.jar
>>> to /hadoop-2.7.1/share/hadoop/yarn/lib/
>>> 2. yarn-site.xml is like this
>>> http://www.owlnet.rice.edu/~xs6/yarn-site.xml
>>> 3. spark-defaults.conf is like this
>>> http://www.owlnet.rice.edu/~xs6/spark-defaults.conf
>>> 4. spark-env.sh is like this
>>> http://www.owlnet.rice.edu/~xs6/spark-env.sh
>>> 5. the command I use to submit spark application is: ./bin/spark-submit
>>> --class org.apache.spark.examples.BroadcastTest --master yarn --deploy-mode
>>> cluster ./examples/target/spark-examples_2.10-1.6.0.jar 1 1000 Http
>>>
>>> However, the job is stuck at RUNNING status, and by looking at the log,
>>> I found that the executor is failed/cancelled frequently...
>>> Here is the log output http://www.owlnet.rice.edu/~xs6/stderr
>>> It shows something like
>>>
>>> 16/03/02 02:07:35 WARN yarn.YarnAllocator: Container marked as failed: 
>>> container_1456905762620_0002_01_02 on host: bold-x.rice.edu. Exit 
>>> status: 1. Diagnostics: Exception from container-launch.
>>>
>>>
>>> Is there anybody know what is the problem here?
>>> Best,
>>> Xiaoye
>>>
>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>
>


Re: Spark on Yarn with Dynamic Resource Allocation. Container always marked as failed

2016-03-02 Thread Prabhu Joseph
Is all NodeManager services restarted after the change in yarn-site.xml

On Thu, Mar 3, 2016 at 6:00 AM, Jeff Zhang  wrote:

> The executor may fail to start. You need to check the executor logs, if
> there's no executor log then you need to check node manager log.
>
> On Wed, Mar 2, 2016 at 4:26 PM, Xiaoye Sun  wrote:
>
>> Hi all,
>>
>> I am very new to spark and yarn.
>>
>> I am running a BroadcastTest example application using spark 1.6.0 and
>> Hadoop/Yarn 2.7.1. in a 5 nodes cluster.
>>
>> I configured my configuration files according to
>> https://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation
>>
>> 1. copy
>> ./spark-1.6.0/network/yarn/target/scala-2.10/spark-1.6.0-yarn-shuffle.jar
>> to /hadoop-2.7.1/share/hadoop/yarn/lib/
>> 2. yarn-site.xml is like this
>> http://www.owlnet.rice.edu/~xs6/yarn-site.xml
>> 3. spark-defaults.conf is like this
>> http://www.owlnet.rice.edu/~xs6/spark-defaults.conf
>> 4. spark-env.sh is like this http://www.owlnet.rice.edu/~xs6/spark-env.sh
>> 5. the command I use to submit spark application is: ./bin/spark-submit
>> --class org.apache.spark.examples.BroadcastTest --master yarn --deploy-mode
>> cluster ./examples/target/spark-examples_2.10-1.6.0.jar 1 1000 Http
>>
>> However, the job is stuck at RUNNING status, and by looking at the log, I
>> found that the executor is failed/cancelled frequently...
>> Here is the log output http://www.owlnet.rice.edu/~xs6/stderr
>> It shows something like
>>
>> 16/03/02 02:07:35 WARN yarn.YarnAllocator: Container marked as failed: 
>> container_1456905762620_0002_01_02 on host: bold-x.rice.edu. Exit 
>> status: 1. Diagnostics: Exception from container-launch.
>>
>>
>> Is there anybody know what is the problem here?
>> Best,
>> Xiaoye
>>
>
>
>
> --
> Best Regards
>
> Jeff Zhang
>


Re: Spark on Yarn with Dynamic Resource Allocation. Container always marked as failed

2016-03-02 Thread Jeff Zhang
The executor may fail to start. You need to check the executor logs, if
there's no executor log then you need to check node manager log.

On Wed, Mar 2, 2016 at 4:26 PM, Xiaoye Sun  wrote:

> Hi all,
>
> I am very new to spark and yarn.
>
> I am running a BroadcastTest example application using spark 1.6.0 and
> Hadoop/Yarn 2.7.1. in a 5 nodes cluster.
>
> I configured my configuration files according to
> https://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation
>
> 1. copy
> ./spark-1.6.0/network/yarn/target/scala-2.10/spark-1.6.0-yarn-shuffle.jar
> to /hadoop-2.7.1/share/hadoop/yarn/lib/
> 2. yarn-site.xml is like this
> http://www.owlnet.rice.edu/~xs6/yarn-site.xml
> 3. spark-defaults.conf is like this
> http://www.owlnet.rice.edu/~xs6/spark-defaults.conf
> 4. spark-env.sh is like this http://www.owlnet.rice.edu/~xs6/spark-env.sh
> 5. the command I use to submit spark application is: ./bin/spark-submit
> --class org.apache.spark.examples.BroadcastTest --master yarn --deploy-mode
> cluster ./examples/target/spark-examples_2.10-1.6.0.jar 1 1000 Http
>
> However, the job is stuck at RUNNING status, and by looking at the log, I
> found that the executor is failed/cancelled frequently...
> Here is the log output http://www.owlnet.rice.edu/~xs6/stderr
> It shows something like
>
> 16/03/02 02:07:35 WARN yarn.YarnAllocator: Container marked as failed: 
> container_1456905762620_0002_01_02 on host: bold-x.rice.edu. Exit status: 
> 1. Diagnostics: Exception from container-launch.
>
>
> Is there anybody know what is the problem here?
> Best,
> Xiaoye
>



-- 
Best Regards

Jeff Zhang


Re: Spark 1.5.2 Yarn Application Master - resiliencey

2016-02-03 Thread Nirav Patel
Awesome! it looks promising. Thanks Rishabh and Marcelo.

On Wed, Feb 3, 2016 at 12:09 PM, Rishabh Wadhawan 
wrote:

> Check out this link
> http://spark.apache.org/docs/latest/configuration.html and check
> spark.shuffle.service. Thanks
>
> On Feb 3, 2016, at 1:02 PM, Marcelo Vanzin  wrote:
>
> Yes, but you don't necessarily need to use dynamic allocation (just enable
> the external shuffle service).
>
> On Wed, Feb 3, 2016 at 11:53 AM, Nirav Patel 
> wrote:
>
>> Do you mean this setup?
>>
>> https://spark.apache.org/docs/1.5.2/job-scheduling.html#dynamic-resource-allocation
>>
>>
>>
>> On Wed, Feb 3, 2016 at 11:50 AM, Marcelo Vanzin 
>> wrote:
>>
>>> Without the exact error from the driver that caused the job to restart,
>>> it's hard to tell. But a simple way to improve things is to install the
>>> Spark shuffle service on the YARN nodes, so that even if an executor
>>> crashes, its shuffle output is still available to other executors.
>>>
>>> On Wed, Feb 3, 2016 at 11:46 AM, Nirav Patel 
>>> wrote:
>>>
 Hi,

 I have a spark job running on yarn-client mode. At some point during
 Join stage, executor(container) runs out of memory and yarn kills it. Due
 to this Entire job restarts! and it keeps doing it on every failure?

 What is the best way to checkpoint? I see there's checkpoint api and
 other option might be to persist before Join stage. Would that prevent
 retry of entire job? How about just retrying only the task that was
 distributed to that faulty executor?

 Thanks



 [image: What's New with Xactly]
 

   [image: LinkedIn]
   [image: Twitter]
   [image: Facebook]
   [image: YouTube]
 
>>>
>>>
>>>
>>>
>>> --
>>> Marcelo
>>>
>>
>>
>>
>>
>> [image: What's New with Xactly] 
>>
>>   [image: LinkedIn]
>>   [image: Twitter]
>>   [image: Facebook]
>>   [image: YouTube]
>> 
>>
>
>
>
> --
> Marcelo
>
>
>

-- 





Re: Spark 1.5.2 Yarn Application Master - resiliencey

2016-02-03 Thread Marcelo Vanzin
Without the exact error from the driver that caused the job to restart,
it's hard to tell. But a simple way to improve things is to install the
Spark shuffle service on the YARN nodes, so that even if an executor
crashes, its shuffle output is still available to other executors.
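
A sketch of that setup (the shuffle jar name varies with the Spark build, and any
aux-services already configured should be kept in the list): copy
spark-<version>-yarn-shuffle.jar onto the NodeManager classpath, register the
aux-service in yarn-site.xml, restart the NodeManagers, and enable it on the Spark side:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>

and in spark-defaults.conf:

spark.shuffle.service.enabled  true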

On Wed, Feb 3, 2016 at 11:46 AM, Nirav Patel  wrote:

> Hi,
>
> I have a spark job running on yarn-client mode. At some point during Join
> stage, executor(container) runs out of memory and yarn kills it. Due to
> this Entire job restarts! and it keeps doing it on every failure?
>
> What is the best way to checkpoint? I see there's checkpoint api and other
> option might be to persist before Join stage. Would that prevent retry of
> entire job? How about just retrying only the task that was distributed to
> that faulty executor?
>
> Thanks
>
>
>
> [image: What's New with Xactly] 
>
>   [image: LinkedIn]
>   [image: Twitter]
>   [image: Facebook]
>   [image: YouTube]
> 




-- 
Marcelo


Re: Spark 1.5.2 Yarn Application Master - resiliencey

2016-02-03 Thread Nirav Patel
Do you mean this setup?
https://spark.apache.org/docs/1.5.2/job-scheduling.html#dynamic-resource-allocation



On Wed, Feb 3, 2016 at 11:50 AM, Marcelo Vanzin  wrote:

> Without the exact error from the driver that caused the job to restart,
> it's hard to tell. But a simple way to improve things is to install the
> Spark shuffle service on the YARN nodes, so that even if an executor
> crashes, its shuffle output is still available to other executors.
>
> On Wed, Feb 3, 2016 at 11:46 AM, Nirav Patel 
> wrote:
>
>> Hi,
>>
>> I have a spark job running on yarn-client mode. At some point during Join
>> stage, executor(container) runs out of memory and yarn kills it. Due to
>> this Entire job restarts! and it keeps doing it on every failure?
>>
>> What is the best way to checkpoint? I see there's checkpoint api and
>> other option might be to persist before Join stage. Would that prevent
>> retry of entire job? How about just retrying only the task that was
>> distributed to that faulty executor?
>>
>> Thanks
>>
>>
>>
>> [image: What's New with Xactly] 
>>
>>   [image: LinkedIn]
>>   [image: Twitter]
>>   [image: Facebook]
>>   [image: YouTube]
>> 
>
>
>
>
> --
> Marcelo
>

-- 





Re: Spark 1.5.2 Yarn Application Master - resiliencey

2016-02-03 Thread Marcelo Vanzin
Yes, but you don't necessarily need to use dynamic allocation (just enable
the external shuffle service).

On Wed, Feb 3, 2016 at 11:53 AM, Nirav Patel  wrote:

> Do you mean this setup?
>
> https://spark.apache.org/docs/1.5.2/job-scheduling.html#dynamic-resource-allocation
>
>
>
> On Wed, Feb 3, 2016 at 11:50 AM, Marcelo Vanzin 
> wrote:
>
>> Without the exact error from the driver that caused the job to restart,
>> it's hard to tell. But a simple way to improve things is to install the
>> Spark shuffle service on the YARN nodes, so that even if an executor
>> crashes, its shuffle output is still available to other executors.
>>
>> On Wed, Feb 3, 2016 at 11:46 AM, Nirav Patel 
>> wrote:
>>
>>> Hi,
>>>
>>> I have a spark job running on yarn-client mode. At some point during
>>> Join stage, executor(container) runs out of memory and yarn kills it. Due
>>> to this Entire job restarts! and it keeps doing it on every failure?
>>>
>>> What is the best way to checkpoint? I see there's checkpoint api and
>>> other option might be to persist before Join stage. Would that prevent
>>> retry of entire job? How about just retrying only the task that was
>>> distributed to that faulty executor?
>>>
>>> Thanks
>>>
>>>
>>>
>>> [image: What's New with Xactly] 
>>>
>>>   [image: LinkedIn]
>>>   [image: Twitter]
>>>   [image: Facebook]
>>>   [image: YouTube]
>>> 
>>
>>
>>
>>
>> --
>> Marcelo
>>
>
>
>
>
> [image: What's New with Xactly] 
>
>   [image: LinkedIn]
>   [image: Twitter]
>   [image: Facebook]
>   [image: YouTube]
> 
>



-- 
Marcelo


Re: Spark 1.5.2 Yarn Application Master - resiliencey

2016-02-03 Thread Rishabh Wadhawan
Hi Nirav,
There is a difference between dynamic resource allocation and the shuffle service.
With dynamic allocation, when you enable the configurations for it, Spark determines
the number of executors required for the current workload, decreasing the executors
when the workload is light and requesting more executors when it is heavy. The
external shuffle service, on the other hand, serves an executor's shuffle output from
the node itself, so even if that executor dies during the process, the other active
executors can still fetch its intermediate shuffle data and keep executing.
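
For reference, the two features are switched on separately; a minimal
spark-defaults.conf sketch (the executor bounds are made-up examples) would be:

# external shuffle service only, as suggested above
spark.shuffle.service.enabled         true

# optionally, dynamic allocation on top of it
spark.dynamicAllocation.enabled       true
spark.dynamicAllocation.minExecutors  2
spark.dynamicAllocation.maxExecutors  20
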
> On Feb 3, 2016, at 1:02 PM, Marcelo Vanzin  wrote:
> 
> Yes, but you don't necessarily need to use dynamic allocation (just enable 
> the external shuffle service).
> 
> On Wed, Feb 3, 2016 at 11:53 AM, Nirav Patel  > wrote:
> Do you mean this setup?
> https://spark.apache.org/docs/1.5.2/job-scheduling.html#dynamic-resource-allocation
>  
> 
> 
> 
> 
> On Wed, Feb 3, 2016 at 11:50 AM, Marcelo Vanzin  > wrote:
> Without the exact error from the driver that caused the job to restart, it's 
> hard to tell. But a simple way to improve things is to install the Spark 
> shuffle service on the YARN nodes, so that even if an executor crashes, its 
> shuffle output is still available to other executors.
> 
> On Wed, Feb 3, 2016 at 11:46 AM, Nirav Patel  > wrote:
> Hi,
> 
> I have a spark job running on yarn-client mode. At some point during Join 
> stage, executor(container) runs out of memory and yarn kills it. Due to this 
> Entire job restarts! and it keeps doing it on every failure?
> 
> What is the best way to checkpoint? I see there's checkpoint api and other 
> option might be to persist before Join stage. Would that prevent retry of 
> entire job? How about just retrying only the task that was distributed to 
> that faulty executor? 
> 
> Thanks
> 
> 
> 
> 
> 
> 
> -- 
> Marcelo
> 
> 
> 
> 
> 
> 
> 
> -- 
> Marcelo



Re: Spark 1.5.2 Yarn Application Master - resiliencey

2016-02-03 Thread Rishabh Wadhawan
Check out this link http://spark.apache.org/docs/latest/configuration.html and
look for the spark.shuffle.service.* properties (in particular
spark.shuffle.service.enabled). Thanks
> On Feb 3, 2016, at 1:02 PM, Marcelo Vanzin  wrote:
> 
> Yes, but you don't necessarily need to use dynamic allocation (just enable 
> the external shuffle service).
> 
> On Wed, Feb 3, 2016 at 11:53 AM, Nirav Patel  > wrote:
> Do you mean this setup?
> https://spark.apache.org/docs/1.5.2/job-scheduling.html#dynamic-resource-allocation
>  
> 
> 
> 
> 
> On Wed, Feb 3, 2016 at 11:50 AM, Marcelo Vanzin  > wrote:
> Without the exact error from the driver that caused the job to restart, it's 
> hard to tell. But a simple way to improve things is to install the Spark 
> shuffle service on the YARN nodes, so that even if an executor crashes, its 
> shuffle output is still available to other executors.
> 
> On Wed, Feb 3, 2016 at 11:46 AM, Nirav Patel  > wrote:
> Hi,
> 
> I have a spark job running on yarn-client mode. At some point during Join 
> stage, executor(container) runs out of memory and yarn kills it. Due to this 
> Entire job restarts! and it keeps doing it on every failure?
> 
> What is the best way to checkpoint? I see there's checkpoint api and other 
> option might be to persist before Join stage. Would that prevent retry of 
> entire job? How about just retrying only the task that was distributed to 
> that faulty executor? 
> 
> Thanks
> 
> 
> 
> 
> 
> 
> -- 
> Marcelo
> 
> 
> 
> 
> 
> 
> 
> -- 
> Marcelo



RE: Spark App -Yarn-Cluster-Mode ===> Hadoop_conf_**.zip file.

2016-01-18 Thread Siddharth Ubale
Hi,

Thanks for pointing out the Phoenix discrepancy! I was using a Phoenix 4.4 jar built
for the HBase 1.1 release while the cluster was running HBase 0.98. I have fixed that
issue.
However, I am still unable to get the streaming job running in cluster mode; it fails
with the following trace:

Application application_1452763526769_0021 failed 2 times due to AM Container 
for appattempt_1452763526769_0021_02 exited with exitCode: -1000
For more detailed output, check application tracking 
page:http://slave1:8088/proxy/application_1452763526769_0021/Then, click on 
links to logs of each attempt.
Diagnostics: File does not exist: 
hdfs://slave1:9000/user/hduser/.sparkStaging/application_1452763526769_0021/__hadoop_conf__7080838197861423764.zip
java.io.FileNotFoundException: File does not exist: 
hdfs://slave1:9000/user/hduser/.sparkStaging/application_1452763526769_0021/__hadoop_conf__7080838197861423764.zip
at 
org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
at 
org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Failing this attempt. Failing the application.


I am not sure what this Hadoop_conf_**.zip file is. Can someone please guide me?
I am submitting a Spark Streaming job which reads from a Kafka topic and dumps data
into HBase tables via the Phoenix API. The job behaves as expected in local mode.
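A couple of quick checks that may help narrow this down (illustrative commands only; adjust host, port and paths to your cluster):

    # does the staging directory still exist, or was it cleaned up before the second attempt?
    hdfs dfs -ls hdfs://slave1:9000/user/hduser/.sparkStaging/

    # pull the full YARN logs for the failed application
    yarn logs -applicationId application_1452763526769_0021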


Thanks
Siddharth Ubale

From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Friday, January 15, 2016 8:08 PM
To: Siddharth Ubale <siddharth.ub...@syncoms.com>
Cc: user@spark.apache.org
Subject: Re: Spark App -Yarn-Cluster-Mode ===> Hadoop_conf_**.zip file.

Interesting. Which hbase / Phoenix releases are you using ?
The following method has been removed from Put:

   public Put setWriteToWAL(boolean write) {

Please make sure the Phoenix release is compatible with your hbase version.

Cheers
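If it helps, a quick way to see which jar is actually supplying the Put class at runtime is to ask the classloader, for example from spark-shell on the cluster (a sketch; the printed path should point at an hbase-client jar matching your HBase version):

    // ask the classloader where Put actually comes from
    val putClass = Class.forName("org.apache.hadoop.hbase.client.Put")
    println(putClass.getProtectionDomain.getCodeSource.getLocation)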

On Fri, Jan 15, 2016 at 6:20 AM, Siddharth Ubale 
<siddharth.ub...@syncoms.com<mailto:siddharth.ub...@syncoms.com>> wrote:
Hi,


This is the log from the application :

16/01/15 19:23:19 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster 
with SUCCEEDED (diag message: Shutdown hook called before final status was 
reported.)
16/01/15 19:23:19 INFO yarn.ApplicationMaster: Deleting staging directory 
.sparkStaging/application_1452763526769_0011
16/01/15 19:23:19 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
Shutting down remote daemon.
16/01/15 19:23:19 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote 
daemon shut down; proceeding with flushing remote transports.
16/01/15 19:23:19 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
Remoting shut down.
16/01/15 19:23:19 INFO util.Utils: Shutdown hook called
16/01/15 19:23:19 INFO client.HConnectionManager$HConnectionImplementation: 
Closing zookeeper sessionid=0x1523f753f6f0061
16/01/15 19:23:19 INFO zookeeper.ClientCnxn: EventThread shut down
16/01/15 19:23:19 INFO zookeeper.ZooKeeper: Session: 0x1523f753f6f0061 closed
16/01/15 19:23:19 ERROR yarn.ApplicationMaster: User class threw exception: 
java.lang.NoSuchMethodError: 
org.apache.hadoop.hbase.client.Put.setWriteToWAL(Z)Lorg/apache/hadoop/hbase/client/Put;
java.lang.NoSuchMethodError: 
org.apache.hadoop.hbase.client.Put.setWriteToWAL(Z)Lorg/apache/hadoop/hbase/client/Put;
at 
org.apache.phoenix.schema.PTableImpl$PRowImpl.newMutations(PTableImpl.java:639)
at 
org.apache.phoenix.schema.PTableImpl$PRowImpl.(PTableImpl.java:632)
at 
org.apache.phoenix.schema.PTableImpl.newRow(PTableImpl.java:557)
at 
org.apache.phoenix.schema.PTableImpl.newRow(PTableImpl.java:573)
at 
org.apache.phoenix.execute.MutationState.addRowMutations(M

Re: Spark 1.6.0, yarn-shuffle

2016-01-18 Thread johd
Hi,

No, i have not. :-/

Regards, J



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-6-0-yarn-shuffle-tp25961p26002.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Spark App -Yarn-Cluster-Mode ===> Hadoop_conf_**.zip file.

2016-01-15 Thread Ted Yu
bq. check application tracking
page:http://slave1:8088/proxy/application_1452763526769_0011/
Then , ...

Have you done the above to see what error was in each attempt ?

Which Spark / hadoop release are you using ?

Thanks

On Fri, Jan 15, 2016 at 5:58 AM, Siddharth Ubale <
siddharth.ub...@syncoms.com> wrote:

> Hi,
>
>
>
> I am trying to run a Spark streaming application in yarn-cluster mode.
> However I am facing an issue where the job ends asking for a particular
> Hadoop_conf_**.zip file in hdfs location.
>
> Can any one guide with this?
>
> The application works fine in local mode only it stops abruptly for want
> of memory.
>
>
>
> Below is the error stack trace:
>
>
>
> diagnostics: Application application_1452763526769_0011 failed 2 times due
> to AM Container for appattempt_1452763526769_0011_02 exited with
> exitCode: -1000
>
> For more detailed output, check application tracking page:
> http://slave1:8088/proxy/application_1452763526769_0011/Then, click on
> links to logs of each attempt.
>
> Diagnostics: File does not exist:
> hdfs://slave1:9000/user/hduser/.sparkStaging/application_1452763526769_0011/__hadoop_conf__1057113228186399290.zip
>
> *java.io.FileNotFoundException: File does not exist:
> hdfs://slave1:9000/user/hduser/.sparkStaging/application_1452763526769_0011/__hadoop_conf__1057113228186399290.zip*
>
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
>
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
>
> at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
>
> at
> org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
>
> at
> org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
>
> at
> org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
>
> at
> org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
>
> at java.security.AccessController.doPrivileged(Native
> Method)
>
> at javax.security.auth.Subject.doAs(Subject.java:422)
>
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>
> at
> org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
>
> at
> org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>
> at java.lang.Thread.run(Thread.java:745)
>
>
>
> Failing this attempt. Failing the application.
>
> ApplicationMaster host: N/A
>
> ApplicationMaster RPC port: -1
>
> queue: default
>
> start time: 1452866026622
>
> final status: FAILED
>
> tracking URL:
> http://slave1:8088/cluster/app/application_1452763526769_0011
>
> user: hduser
>
> Exception in thread "main" org.apache.spark.SparkException: Application
> application_1452763526769_0011 finished with failed status
>
> at
> org.apache.spark.deploy.yarn.Client.run(Client.scala:841)
>
> at
> org.apache.spark.deploy.yarn.Client$.main(Client.scala:867)
>
> at org.apache.spark.deploy.yarn.Client.main(Client.scala)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
>
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
> at java.lang.reflect.Method.invoke(Method.java:497)
>
> at
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
>
> at
> org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
>
> at
> org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
>
> at
> org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
>
> at
> org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> 16/01/15 19:23:53 INFO Utils: Shutdown hook called
>
> 16/01/15 19:23:53 INFO Utils: Deleting directory
> 

Re: Spark App -Yarn-Cluster-Mode ===> Hadoop_conf_**.zip file.

2016-01-15 Thread Ted Yu
(PhoenixConnect.java:26)
>
> at
> spark.stream.eventStream.startStream(eventStream.java:105)
>
> at
> time.series.wo.agg.InputStreamSpark.main(InputStreamSpark.java:38)
>
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method)
>
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>
> at java.lang.reflect.Method.invoke(Method.java:497)
>
> at
> org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
>
> Thanks,
>
> Siddharth
>
>
>
>
>
> *From:* Ted Yu [mailto:yuzhih...@gmail.com]
> *Sent:* Friday, January 15, 2016 7:43 PM
> *To:* Siddharth Ubale <siddharth.ub...@syncoms.com>
> *Cc:* user@spark.apache.org
> *Subject:* Re: Spark App -Yarn-Cluster-Mode ===> Hadoop_conf_**.zip file.
>
>
>
> bq. check application tracking 
> page:http://slave1:8088/proxy/application_1452763526769_0011/
> Then <http://slave1:8088/proxy/application_1452763526769_0011/Then>, ...
>
>
>
> Have you done the above to see what error was in each attempt ?
>
>
>
> Which Spark / hadoop release are you using ?
>
>
>
> Thanks
>
>
>
> On Fri, Jan 15, 2016 at 5:58 AM, Siddharth Ubale <
> siddharth.ub...@syncoms.com> wrote:
>
> Hi,
>
>
>
> I am trying to run a Spark streaming application in yarn-cluster mode.
> However I am facing an issue where the job ends asking for a particular
> Hadoop_conf_**.zip file in hdfs location.
>
> Can any one guide with this?
>
> The application works fine in local mode only it stops abruptly for want
> of memory.
>
>
>
> Below is the error stack trace:
>
>
>
> diagnostics: Application application_1452763526769_0011 failed 2 times due
> to AM Container for appattempt_1452763526769_0011_02 exited with
> exitCode: -1000
>
> For more detailed output, check application tracking page:
> http://slave1:8088/proxy/application_1452763526769_0011/Then, click on
> links to logs of each attempt.
>
> Diagnostics: File does not exist:
> hdfs://slave1:9000/user/hduser/.sparkStaging/application_1452763526769_0011/__hadoop_conf__1057113228186399290.zip
>
> *java.io.FileNotFoundException: File does not exist:
> hdfs://slave1:9000/user/hduser/.sparkStaging/application_1452763526769_0011/__hadoop_conf__1057113228186399290.zip*
>
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1122)
>
> at
> org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
>
> at
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>
> at
> org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
>
> at
> org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:251)
>
> at
> org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:61)
>
> at
> org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
>
> at
> org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:357)
>
> at java.security.AccessController.doPrivileged(Native
> Method)
>
> at javax.security.auth.Subject.doAs(Subject.java:422)
>
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
>
> at
> org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:356)
>
> at
> org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:60)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>
> at java.lang.Thread.run(Thread.java:745)
>
>
>
> Failing this attempt. Failing the application.
>
> ApplicationMaster host: N/A
>
> ApplicationMaster RPC port: -1
>
> queue: default
>
> start time: 1452866026622
>
> final status: FAILED
>
> tracking URL:
> http://slave1:8088/cl

RE: Spark App -Yarn-Cluster-Mode ===> Hadoop_conf_**.zip file.

2016-01-15 Thread Siddharth Ubale
Hi,


This is the log from the application :

16/01/15 19:23:19 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster 
with SUCCEEDED (diag message: Shutdown hook called before final status was 
reported.)
16/01/15 19:23:19 INFO yarn.ApplicationMaster: Deleting staging directory 
.sparkStaging/application_1452763526769_0011
16/01/15 19:23:19 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
Shutting down remote daemon.
16/01/15 19:23:19 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote 
daemon shut down; proceeding with flushing remote transports.
16/01/15 19:23:19 INFO remote.RemoteActorRefProvider$RemotingTerminator: 
Remoting shut down.
16/01/15 19:23:19 INFO util.Utils: Shutdown hook called
16/01/15 19:23:19 INFO client.HConnectionManager$HConnectionImplementation: 
Closing zookeeper sessionid=0x1523f753f6f0061
16/01/15 19:23:19 INFO zookeeper.ClientCnxn: EventThread shut down
16/01/15 19:23:19 INFO zookeeper.ZooKeeper: Session: 0x1523f753f6f0061 closed
16/01/15 19:23:19 ERROR yarn.ApplicationMaster: User class threw exception: 
java.lang.NoSuchMethodError: 
org.apache.hadoop.hbase.client.Put.setWriteToWAL(Z)Lorg/apache/hadoop/hbase/client/Put;
java.lang.NoSuchMethodError: 
org.apache.hadoop.hbase.client.Put.setWriteToWAL(Z)Lorg/apache/hadoop/hbase/client/Put;
at 
org.apache.phoenix.schema.PTableImpl$PRowImpl.newMutations(PTableImpl.java:639)
at 
org.apache.phoenix.schema.PTableImpl$PRowImpl.(PTableImpl.java:632)
at 
org.apache.phoenix.schema.PTableImpl.newRow(PTableImpl.java:557)
at 
org.apache.phoenix.schema.PTableImpl.newRow(PTableImpl.java:573)
at 
org.apache.phoenix.execute.MutationState.addRowMutations(MutationState.java:185)
at 
org.apache.phoenix.execute.MutationState.access$200(MutationState.java:79)
at 
org.apache.phoenix.execute.MutationState$2.init(MutationState.java:258)
at 
org.apache.phoenix.execute.MutationState$2.(MutationState.java:255)
at 
org.apache.phoenix.execute.MutationState.toMutations(MutationState.java:253)
at 
org.apache.phoenix.execute.MutationState.toMutations(MutationState.java:243)
at 
org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:1840)
at 
org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:744)
at 
org.apache.phoenix.compile.CreateTableCompiler$2.execute(CreateTableCompiler.java:186)
at 
org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:303)
at 
org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:295)
at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
at 
org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:293)
at 
org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1236)
at 
org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1891)
at 
org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:1860)
at 
org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:77)
at 
org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:1860)
at 
org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:162)
at 
org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.connect(PhoenixEmbeddedDriver.java:131)
at 
org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:133)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:270)
at 
spark.phoenix.PhoenixConnect.getConnection(PhoenixConnect.java:26)
at spark.stream.eventStream.startStream(eventStream.java:105)
at 
time.series.wo.agg.InputStreamSpark.main(InputStreamSpark.java:38)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at 
org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:483)
Thanks,
Siddharth


From: Ted Yu [mailto:yuzhih...@gmail.com]
Sent: Friday, January 15, 2016 7:43 PM
To: Siddharth Ubale <siddharth.ub...@syncoms.com>
Cc: user@spark.apache.org
Subject: Re: Spark App -Yarn-Cluster-Mode ===> Hadoop_conf_**.zip file.

bq. check application tracking 
page:http://slave1:8

Re: [Spark on YARN] Multiple Auxiliary Shuffle Service Versions

2016-01-06 Thread Deenar Toraskar
Hi guys


   1. >> Add this jar to the classpath of all NodeManagers in your cluster.


A related question on configuration of the auxiliary shuffle service: *how
do I find the classpath for the NodeManager?* I tried finding all the places
where the existing mapreduce shuffle jars are present and placed the spark yarn
shuffle jar in the same locations, but with no success.

$ find . -name *shuffle*.jar
./hadoop/client/hadoop-mapreduce-client-shuffle.jar
./hadoop/client/hadoop-mapreduce-client-shuffle-2.7.1.2.3.2.0-2950.jar
./hadoop/client/spark-1.6.0-SNAPSHOT-yarn-shuffle.jar
./hadoop-mapreduce/hadoop-mapreduce-client-shuffle.jar
./hadoop-mapreduce/hadoop-mapreduce-client-shuffle-2.7.1.2.3.2.0-2950.jar
./falcon/client/lib/hadoop-mapreduce-client-shuffle-2.7.1.2.3.2.0-2950.jar
./oozie/libserver/hadoop-mapreduce-client-shuffle-2.7.1.2.3.2.0-2950.jar
./oozie/libtools/hadoop-mapreduce-client-shuffle-2.7.1.2.3.2.0-2950.jar
./spark/lib/spark-1.4.1.2.3.2.0-2950-yarn-shuffle.jar
Regards
Deenar
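One possible approach (a sketch, not distro-specific advice): the YARN daemons are started with the classpath assembled by the Hadoop scripts, which you can print on a NodeManager host and then drop (or symlink) the yarn-shuffle jar into one of the listed lib directories before restarting the NodeManagers:

    # run on a NodeManager host; prints the directories/jars the YARN daemons pick up
    yarn classpath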

On 7 October 2015 at 01:27, Alex Rovner  wrote:

> Thank you all for your help.
>
> *Alex Rovner*
> *Director, Data Engineering *
> *o:* 646.759.0052
>
> * *
>
> On Tue, Oct 6, 2015 at 11:17 AM, Steve Loughran 
> wrote:
>
>>
>> On 6 Oct 2015, at 01:23, Andrew Or  wrote:
>>
>> Both the history server and the shuffle service are backward compatible,
>> but not forward compatible. This means as long as you have the latest
>> version of history server / shuffle service running in your cluster then
>> you're fine (you don't need multiple of them).
>>
>>
>> FWIW I've just created a JIRA on tracking/reporting version mismatch on
>> history server playback better:
>> https://issues.apache.org/jira/browse/SPARK-10950
>>
>> Even though the UI can't be expected to playback later histories, it
>> could be possible to report the issue in a way that users can act on "run a
>> later version", rather than raise support calls.
>>
>>
>


Re: ​Spark 1.6 - YARN Cluster Mode

2015-12-21 Thread Akhil Das
Try adding these properties:

spark.driver.extraJavaOptions -Dhdp.version=2.3.2.0-2950
spark.yarn.am.extraJavaOptions -Dhdp.version=2.3.2.0-2950

There was a similar discussion with Spark 1.3.0 over here:
http://stackoverflow.com/questions/29470542/spark-1-3-0-running-pi-example-on-yarn-fails


Thanks
Best Regards
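If editing the properties file is inconvenient, the same two settings can also be passed on the command line; a minimal sketch, reusing the class, jar and hdp.version value from this thread (substitute your own):

    spark-submit --master yarn --deploy-mode cluster \
      --conf spark.driver.extraJavaOptions=-Dhdp.version=2.3.2.0-2950 \
      --conf spark.yarn.am.extraJavaOptions=-Dhdp.version=2.3.2.0-2950 \
      --class org.apache.spark.examples.SparkPi \
      /opt/spark/lib/spark-examples-1.6.0-SNAPSHOT-hadoop2.7.1.jar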

On Fri, Dec 18, 2015 at 1:33 AM, syepes  wrote:

> Hello,
>
> This week I have been testing 1.6 (#d509194b) in our HDP 2.3 platform and
> its been working pretty ok, at the exception of the YARN cluster deployment
> mode.
> Note that with 1.5 using the same "spark-props.conf" and "spark-env.sh"
> config files the cluster mode works as expected.
>
> Has anyone else also tried the cluster mode in 1.6?
>
>
> Problem reproduction:
> 
> # spark-submit --master yarn --deploy-mode cluster --num-executors 1
> --properties-file $PWD/spark-props.conf --class
> org.apache.spark.examples.SparkPi
> /opt/spark/lib/spark-examples-1.6.0-SNAPSHOT-hadoop2.7.1.jar
>
> Error: Could not find or load main class
> org.apache.spark.deploy.yarn.ApplicationMaster
>
> spark-props.conf
> -
> spark.driver.extraJavaOptions-Dhdp.version=2.3.2.0-2950
> spark.driver.extraLibraryPath
>
> /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
> spark.executor.extraJavaOptions  -Dhdp.version=2.3.2.0-2950
> spark.executor.extraLibraryPath
>
> /usr/hdp/current/hadoop-client/lib/native:/usr/hdp/current/hadoop-client/lib/native/Linux-amd64-64
> -
>
> I will try to do some more debugging on this issue.
>
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-6-YARN-Cluster-Mode-tp25729.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>


Re: Spark on YARN multitenancy

2015-12-15 Thread Ben Roling
I'm curious to see the feedback others will provide.  My impression is the
only way to get Spark to give up resources while it is idle would be to use
the preemption feature of the scheduler you're using in YARN.  When another
user comes along the scheduler would preempt one or more Spark executors to
free the resources the user is entitled to.  The question becomes how much
inefficiency the preemption creates due to lost work that has to be redone
by the Spark job.  I'm not sure the best way to generalize a thought about
how big of a deal that would be.  I imagine it depends on several factors.

On Tue, Dec 15, 2015 at 9:31 AM David Fox  wrote:

> Hello Spark experts,
>
> We are currently evaluating Spark on our cluster that already supports
> MRv2 over YARN.
>
> We have noticed a problem with running jobs concurrently, in particular
> that a running Spark job will not release its resources until the job is
> finished. Ideally, if two people run any combination of MRv2 and Spark
> jobs, the resources should be fairly distributed.
>
> I have noticed a feature called "dynamic resource allocation" in Spark
> 1.2, but this does not seem to be solving the problem, because it releases
> resources only when Spark is IDLE, not while it's BUSY. What I am looking
> for is similar approch to MapReduce where a new user obtains fair share of
> resources
>
> I haven't been able to locate any further information on this matter. On
> the other hand, I feel this must be pretty common issue for a lot of users.
>
> So,
>
>1. What is your experience when dealing with multitenant (multiple
>users) Spark cluster with YARN?
>2. Is Spark architectually adept to support releasing resources while
>it's busy? Is this a planned feature or is it something that conflicts with
>the idea of Spark executors?
>
> Thanks
>


Re: Spark on YARN multitenancy

2015-12-15 Thread Ashwin Sai Shankar
We run large multi-tenant clusters with spark/hadoop workloads, and we use
'yarn's preemption'/'spark's dynamic allocation' to achieve multitenancy.

See the following link on how to enable/configure preemption using the fair
scheduler:
http://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/FairScheduler.html
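For anyone looking for the concrete knobs, a rough sketch of the pieces involved (property names as in the Hadoop fair-scheduler and Spark docs; verify them against the versions you run):

    # yarn-site.xml: use the fair scheduler and allow it to preempt containers
    yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
    yarn.scheduler.fair.preemption = true

    # spark-defaults.conf: let Spark grow and shrink its executor count
    spark.dynamicAllocation.enabled = true
    spark.shuffle.service.enabled = true
    spark.dynamicAllocation.minExecutors = 1
    spark.dynamicAllocation.maxExecutors = 50

Per-queue preemption timeouts (minSharePreemptionTimeout / fairSharePreemptionTimeout) go in the fair-scheduler allocation file.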



On Tue, Dec 15, 2015 at 9:37 AM, Ben Roling  wrote:

> Oops - I meant while it is *busy* when I said while it is *idle*.
>
> On Tue, Dec 15, 2015 at 11:35 AM Ben Roling  wrote:
>
>> I'm curious to see the feedback others will provide.  My impression is
>> the only way to get Spark to give up resources while it is idle would be to
>> use the preemption feature of the scheduler you're using in YARN.  When
>> another user comes along the scheduler would preempt one or more Spark
>> executors to free the resources the user is entitled to.  The question
>> becomes how much inefficiency the preemption creates due to lost work that
>> has to be redone by the Spark job.  I'm not sure the best way to generalize
>> a thought about how big of a deal that would be.  I imagine it depends on
>> several factors.
>>
>> On Tue, Dec 15, 2015 at 9:31 AM David Fox  wrote:
>>
>>> Hello Spark experts,
>>>
>>> We are currently evaluating Spark on our cluster that already supports
>>> MRv2 over YARN.
>>>
>>> We have noticed a problem with running jobs concurrently, in particular
>>> that a running Spark job will not release its resources until the job is
>>> finished. Ideally, if two people run any combination of MRv2 and Spark
>>> jobs, the resources should be fairly distributed.
>>>
>>> I have noticed a feature called "dynamic resource allocation" in Spark
>>> 1.2, but this does not seem to be solving the problem, because it releases
>>> resources only when Spark is IDLE, not while it's BUSY. What I am looking
>>> for is similar approch to MapReduce where a new user obtains fair share of
>>> resources
>>>
>>> I haven't been able to locate any further information on this matter. On
>>> the other hand, I feel this must be pretty common issue for a lot of users.
>>>
>>> So,
>>>
>>>1. What is your experience when dealing with multitenant (multiple
>>>users) Spark cluster with YARN?
>>>2. Is Spark architectually adept to support releasing resources
>>>while it's busy? Is this a planned feature or is it something that
>>>conflicts with the idea of Spark executors?
>>>
>>> Thanks
>>>
>>


Re: Spark on YARN multitenancy

2015-12-15 Thread Ben Roling
Oops - I meant while it is *busy* when I said while it is *idle*.

On Tue, Dec 15, 2015 at 11:35 AM Ben Roling  wrote:

> I'm curious to see the feedback others will provide.  My impression is the
> only way to get Spark to give up resources while it is idle would be to use
> the preemption feature of the scheduler you're using in YARN.  When another
> user comes along the scheduler would preempt one or more Spark executors to
> free the resources the user is entitled to.  The question becomes how much
> inefficiency the preemption creates due to lost work that has to be redone
> by the Spark job.  I'm not sure the best way to generalize a thought about
> how big of a deal that would be.  I imagine it depends on several factors.
>
> On Tue, Dec 15, 2015 at 9:31 AM David Fox  wrote:
>
>> Hello Spark experts,
>>
>> We are currently evaluating Spark on our cluster that already supports
>> MRv2 over YARN.
>>
>> We have noticed a problem with running jobs concurrently, in particular
>> that a running Spark job will not release its resources until the job is
>> finished. Ideally, if two people run any combination of MRv2 and Spark
>> jobs, the resources should be fairly distributed.
>>
>> I have noticed a feature called "dynamic resource allocation" in Spark
>> 1.2, but this does not seem to be solving the problem, because it releases
>> resources only when Spark is IDLE, not while it's BUSY. What I am looking
>> for is similar approch to MapReduce where a new user obtains fair share of
>> resources
>>
>> I haven't been able to locate any further information on this matter. On
>> the other hand, I feel this must be pretty common issue for a lot of users.
>>
>> So,
>>
>>1. What is your experience when dealing with multitenant (multiple
>>users) Spark cluster with YARN?
>>2. Is Spark architectually adept to support releasing resources while
>>it's busy? Is this a planned feature or is it something that conflicts 
>> with
>>the idea of Spark executors?
>>
>> Thanks
>>
>


Re: Spark on YARN: java.lang.ClassCastException SerializedLambda to org.apache.spark.api.java.function.Function in instance of org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1

2015-12-06 Thread Mohamed Nadjib Mami
Your jars are not delivered to the workers. Have a look at this:
http://stackoverflow.com/questions/24052899/how-to-make-it-easier-to-deploy-my-jar-to-spark-cluster-in-standalone-mode
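In practice that usually means shipping the application jar (and any extra jars) with the submit command instead of relying on them being present on the workers; a minimal sketch with placeholder names throughout:

    # --jars ships extra dependencies to the executors along with the application jar
    spark-submit --master yarn \
      --class com.example.MyApp \
      --jars /path/to/extra-dep1.jar,/path/to/extra-dep2.jar \
      /path/to/my-app-assembly.jar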



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-on-YARN-java-lang-ClassCastException-SerializedLambda-to-org-apache-spark-api-java-function-Fu1-tp21261p25602.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Spark on yarn vs spark standalone

2015-11-30 Thread Jacek Laskowski
Hi,

My understanding of Spark on YARN and even Spark in general is very
limited so keep that in mind.

I'm not sure why you compare yarn-cluster and spark standalone. In
yarn-cluster mode the driver runs on a node inside the YARN cluster, while
spark standalone keeps the driver on the machine from which you launched the
Spark application. Also, YARN supports retrying applications while
standalone doesn't. There's also support for rack locality preference
(though I don't know whether and where Spark uses it).

My limited understanding suggests using Spark on YARN if you're planning to
use Hadoop/HDFS and submit jobs through YARN. Standalone is the easier entry
point; insisting on YARN could hinder introducing Spark to organizations
that don't already run Hadoop YARN.

Just my two cents.

Pozdrawiam,
Jacek

--
Jacek Laskowski | https://medium.com/@jaceklaskowski/ |
http://blog.jaceklaskowski.pl
Mastering Spark https://jaceklaskowski.gitbooks.io/mastering-apache-spark/
Follow me at https://twitter.com/jaceklaskowski
Upvote at http://stackoverflow.com/users/1305344/jacek-laskowski


On Fri, Nov 27, 2015 at 8:36 AM, cs user  wrote:
> Hi All,
>
> Apologies if this question has been asked before. I'd like to know if there
> are any downsides to running spark over yarn with the --master yarn-cluster
> option vs having a separate spark standalone cluster to execute jobs?
>
> We're looking at installing a hdfs/hadoop cluster with Ambari and submitting
> jobs to the cluster using yarn, or having an Ambari cluster and a separate
> standalone spark cluster, which will run the spark jobs on data within hdfs.
>
> With yarn, will we still get all the benefits of spark?
>
> Will it be possible to process streaming data?
>
> Many thanks in advance for any responses.
>
> Cheers!

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Spark on yarn vs spark standalone

2015-11-30 Thread Jacek Laskowski
Hi Mark,

I said I've only managed to develop a limited understanding of how
Spark works in the different deploy modes ;-)

But somehow I thought that cluster deploy mode in spark standalone was not
supported. I think I've recently seen a JIRA that said something along those
lines. Can't find it now :(

Pozdrawiam,
Jacek

--
Jacek Laskowski | https://medium.com/@jaceklaskowski/ |
http://blog.jaceklaskowski.pl
Mastering Spark https://jaceklaskowski.gitbooks.io/mastering-apache-spark/
Follow me at https://twitter.com/jaceklaskowski
Upvote at http://stackoverflow.com/users/1305344/jacek-laskowski


On Mon, Nov 30, 2015 at 6:58 PM, Mark Hamstra  wrote:
> Standalone mode also supports running the driver on a cluster node.  See
> "cluster" mode in
> http://spark.apache.org/docs/latest/spark-standalone.html#launching-spark-applications
> .  Also,
> http://spark.apache.org/docs/latest/spark-standalone.html#high-availability
>
> On Mon, Nov 30, 2015 at 9:47 AM, Jacek Laskowski  wrote:
>>
>> Hi,
>>
>> My understanding of Spark on YARN and even Spark in general is very
>> limited so keep that in mind.
>>
>> I'm not sure why you compare yarn-cluster and spark standalone? In
>> yarn-cluster a driver runs on a node in the YARN cluster while spark
>> standalone keeps the driver on the machine you launched a Spark
>> application. Also, YARN cluster supports retrying applications while
>> standalone doesn't. There's also support for rack locality preference
>> (but dunno if that's used and where in Spark).
>>
>> My limited understanding suggests me to use Spark on YARN if you're
>> considering to use Hadoop/HDFS and submitting jobs using YARN.
>> Standalone's an entry option where throwing in YARN could kill
>> introducing Spark to organizations without Hadoop YARN.
>>
>> Just my two cents.
>>
>> Pozdrawiam,
>> Jacek
>>
>> --
>> Jacek Laskowski | https://medium.com/@jaceklaskowski/ |
>> http://blog.jaceklaskowski.pl
>> Mastering Spark https://jaceklaskowski.gitbooks.io/mastering-apache-spark/
>> Follow me at https://twitter.com/jaceklaskowski
>> Upvote at http://stackoverflow.com/users/1305344/jacek-laskowski
>>
>>
>> On Fri, Nov 27, 2015 at 8:36 AM, cs user  wrote:
>> > Hi All,
>> >
>> > Apologies if this question has been asked before. I'd like to know if
>> > there
>> > are any downsides to running spark over yarn with the --master
>> > yarn-cluster
>> > option vs having a separate spark standalone cluster to execute jobs?
>> >
>> > We're looking at installing a hdfs/hadoop cluster with Ambari and
>> > submitting
>> > jobs to the cluster using yarn, or having an Ambari cluster and a
>> > separate
>> > standalone spark cluster, which will run the spark jobs on data within
>> > hdfs.
>> >
>> > With yarn, will we still get all the benefits of spark?
>> >
>> > Will it be possible to process streaming data?
>> >
>> > Many thanks in advance for any responses.
>> >
>> > Cheers!
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Spark on yarn vs spark standalone

2015-11-30 Thread Mark Hamstra
Standalone mode also supports running the driver on a cluster node.  See
"cluster" mode in
http://spark.apache.org/docs/latest/spark-standalone.html#launching-spark-applications
.  Also,
http://spark.apache.org/docs/latest/spark-standalone.html#high-availability

On Mon, Nov 30, 2015 at 9:47 AM, Jacek Laskowski  wrote:

> Hi,
>
> My understanding of Spark on YARN and even Spark in general is very
> limited so keep that in mind.
>
> I'm not sure why you compare yarn-cluster and spark standalone? In
> yarn-cluster a driver runs on a node in the YARN cluster while spark
> standalone keeps the driver on the machine you launched a Spark
> application. Also, YARN cluster supports retrying applications while
> standalone doesn't. There's also support for rack locality preference
> (but dunno if that's used and where in Spark).
>
> My limited understanding suggests me to use Spark on YARN if you're
> considering to use Hadoop/HDFS and submitting jobs using YARN.
> Standalone's an entry option where throwing in YARN could kill
> introducing Spark to organizations without Hadoop YARN.
>
> Just my two cents.
>
> Pozdrawiam,
> Jacek
>
> --
> Jacek Laskowski | https://medium.com/@jaceklaskowski/ |
> http://blog.jaceklaskowski.pl
> Mastering Spark https://jaceklaskowski.gitbooks.io/mastering-apache-spark/
> Follow me at https://twitter.com/jaceklaskowski
> Upvote at http://stackoverflow.com/users/1305344/jacek-laskowski
>
>
> On Fri, Nov 27, 2015 at 8:36 AM, cs user  wrote:
> > Hi All,
> >
> > Apologies if this question has been asked before. I'd like to know if
> there
> > are any downsides to running spark over yarn with the --master
> yarn-cluster
> > option vs having a separate spark standalone cluster to execute jobs?
> >
> > We're looking at installing a hdfs/hadoop cluster with Ambari and
> submitting
> > jobs to the cluster using yarn, or having an Ambari cluster and a
> separate
> > standalone spark cluster, which will run the spark jobs on data within
> hdfs.
> >
> > With yarn, will we still get all the benefits of spark?
> >
> > Will it be possible to process streaming data?
> >
> > Many thanks in advance for any responses.
> >
> > Cheers!
>
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>


Re: Spark on yarn vs spark standalone

2015-11-26 Thread Jeff Zhang
If your cluster is a dedicated spark cluster (only running spark job, no
other jobs like hive/pig/mr), then spark standalone would be fine.
Otherwise I think yarn would be a better option.

On Fri, Nov 27, 2015 at 3:36 PM, cs user  wrote:

> Hi All,
>
> Apologies if this question has been asked before. I'd like to know if
> there are any downsides to running spark over yarn with the --master
> yarn-cluster option vs having a separate spark standalone cluster to
> execute jobs?
>
> We're looking at installing a hdfs/hadoop cluster with Ambari and
> submitting jobs to the cluster using yarn, or having an Ambari cluster and
> a separate standalone spark cluster, which will run the spark jobs on data
> within hdfs.
>
> With yarn, will we still get all the benefits of spark?
>
> Will it be possible to process streaming data?
>
> Many thanks in advance for any responses.
>
> Cheers!
>



-- 
Best Regards

Jeff Zhang


Re: Spark on YARN using Java 1.8 fails

2015-11-11 Thread mvle
Unfortunately, no. I switched back to OpenJDK 1.7.
Didn't get a chance to dig deeper.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-on-YARN-using-Java-1-8-fails-tp24925p25360.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Spark on YARN using Java 1.8 fails

2015-11-11 Thread Abel Rincón
Hi,

There was another related question

https://mail-archives.apache.org/mod_mbox/incubator-spark-user/201506.mbox/%3CCAJ2peNeruM2Y2Tbf8-Wiras-weE586LM_o25FsN=+z1-bfw...@mail.gmail.com%3E


Some months ago, if I remember correctly, we had the same problem using Spark 1.3 +
YARN + Java 8.
https://issues.apache.org/jira/browse/SPARK-6388

BTW, nowadays we chose to use Java 7.


RE: Spark on Yarn

2015-10-21 Thread Jean-Baptiste Onofré


Hi
The compiled version (master side) and client version diverge on spark network 
JavaUtils. You should use the same/aligned version.
Regards
JB


Sent from my Samsung device

 Original message 
From: Raghuveer Chanda  
Date: 21/10/2015  12:33  (GMT+01:00) 
To: user@spark.apache.org 
Subject: Spark on Yarn 

Hi all,

I am trying to run spark on yarn in quickstart cloudera vm. It already has spark 1.3
and Hadoop 2.6.0-cdh5.4.0 installed. (I am not using spark-submit since I want to run
a different version of spark.) I am able to run spark 1.3 on yarn but get the below
error for spark 1.4.

The log shows its running on spark 1.4 but still gives a error on a method which is
present in 1.4 and not 1.3. Even the fat jar contains the class files of 1.4.

As far as running in yarn the installed spark version shouldnt matter, but still its
running on the other version.

Hadoop Version:
Hadoop 2.6.0-cdh5.4.0
Subversion http://github.com/cloudera/hadoop -r c788a14a5de9ecd968d1e2666e8765c5f018c271
Compiled by jenkins on 2015-04-21T19:18Z
Compiled with protoc 2.5.0
From source with checksum cd78f139c66c13ab5cee96e15a629025
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.4.0.jar

Error:
LogType:stderr
Log Upload Time:Tue Oct 20 21:58:56 -0700 2015
LogLength:2334
Log Contents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/filecache/10/simple-yarn-app-1.1.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/10/20 21:58:50 INFO spark.SparkContext: Running Spark version 1.4.0
15/10/20 21:58:53 INFO spark.SecurityManager: Changing view acls to: yarn
15/10/20 21:58:53 INFO spark.SecurityManager: Changing modify acls to: yarn
15/10/20 21:58:53 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(yarn); users with modify permissions: Set(yarn)
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.spark.network.util.JavaUtils.timeStringAsSec(Ljava/lang/String;)J
at org.apache.spark.util.Utils$.timeStringAsSeconds(Utils.scala:1027)
at org.apache.spark.SparkConf.getTimeAsSeconds(SparkConf.scala:194)
at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:68)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:54)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1991)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1982)
at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:56)
at org.apache.spark.rpc.akka.AkkaRpcEnvFactory.create(AkkaRpcEnv.scala:245)
at org.apache.spark.rpc.RpcEnv$.create(RpcEnv.scala:52)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:247)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:188)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:267)
at org.apache.spark.SparkContext.(SparkContext.scala:424)
at org.apache.spark.api.java.JavaSparkContext.(JavaSparkContext.scala:61)
at com.hortonworks.simpleyarnapp.HelloWorld.main(HelloWorld.java:50)
15/10/20 21:58:53 INFO util.Utils: Shutdown hook called

Please help :)

--
Regards and Thanks,
Raghuveer Chanda




Re: Spark on Yarn

2015-10-21 Thread Raghuveer Chanda
Hi,

So does this mean I can't run a spark 1.4 fat jar on yarn without installing
spark 1.4?

I am including spark 1.4 in my pom.xml, so doesn't that mean it's compiling
against 1.4?


On Wed, Oct 21, 2015 at 4:38 PM, Jean-Baptiste Onofré 
wrote:

> Hi
>
> The compiled version (master side) and client version diverge on spark
> network JavaUtils. You should use the same/aligned version.
>
> Regards
> JB
>
>
>
> Sent from my Samsung device
>
>
>  Original message 
> From: Raghuveer Chanda 
> Date: 21/10/2015 12:33 (GMT+01:00)
> To: user@spark.apache.org
> Subject: Spark on Yarn
>
> Hi all,
>
> I am trying to run spark on yarn in quickstart cloudera vm.It already has
> spark 1.3 and Hadoop 2.6.0-cdh5.4.0 installed.(I am not using
> spark-submit since I want to run a different version of spark).
>
> I am able to run spark 1.3 on yarn but get the below error for spark 1.4.
>
> The log shows its running on spark 1.4 but still gives a error on a method
> which is present in 1.4 and not 1.3. Even the fat jar contains the class
> files of 1.4.
>
> As far as running in yarn the installed spark version shouldnt matter, but
> still its running on the other version.
>
>
> *Hadoop Version:*
> Hadoop 2.6.0-cdh5.4.0
> Subversion http://github.com/cloudera/hadoop -r
> c788a14a5de9ecd968d1e2666e8765c5f018c271
> Compiled by jenkins on 2015-04-21T19:18Z
> Compiled with protoc 2.5.0
> From source with checksum cd78f139c66c13ab5cee96e15a629025
> This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.4.0.jar
>
> *Error:*
> LogType:stderr
> Log Upload Time:Tue Oct 20 21:58:56 -0700 2015
> LogLength:2334
> Log Contents:
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/filecache/10/simple-yarn-app-1.1.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 15/10/20 21:58:50 INFO spark.SparkContext: *Running Spark version 1.4.0*
> 15/10/20 21:58:53 INFO spark.SecurityManager: Changing view acls to: yarn
> 15/10/20 21:58:53 INFO spark.SecurityManager: Changing modify acls to: yarn
> 15/10/20 21:58:53 INFO spark.SecurityManager: SecurityManager:
> authentication disabled; ui acls disabled; users with view permissions:
> Set(yarn); users with modify permissions: Set(yarn)
> *Exception in thread "main" java.lang.NoSuchMethodError:
> org.apache.spark.network.util.JavaUtils.timeStringAsSec(Ljava/lang/String;)J*
> at org.apache.spark.util.Utils$.timeStringAsSeconds(Utils.scala:1027)
> at org.apache.spark.SparkConf.getTimeAsSeconds(SparkConf.scala:194)
> at
> org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:68)
> at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:54)
> at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
> at
> org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1991)
> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
> at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1982)
> at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:56)
> at org.apache.spark.rpc.akka.AkkaRpcEnvFactory.create(AkkaRpcEnv.scala:245)
> at org.apache.spark.rpc.RpcEnv$.create(RpcEnv.scala:52)
> at org.apache.spark.SparkEnv$.create(SparkEnv.scala:247)
> at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:188)
> at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:267)
> at org.apache.spark.SparkContext.(SparkContext.scala:424)
> at
> org.apache.spark.api.java.JavaSparkContext.(JavaSparkContext.scala:61)
> at com.hortonworks.simpleyarnapp.HelloWorld.main(HelloWorld.java:50)
> 15/10/20 21:58:53 INFO util.Utils: Shutdown hook called
>
> Please help :)
>
> --
> Regards and Thanks,
> Raghuveer Chanda
>



-- 
Regards,
Raghuveer Chanda
Computer Science and Engineering
IIT Kharagpur
+91-9475470374


Re: Spark on Yarn

2015-10-21 Thread Adrian Tanase
The question is whether the spark dependency is marked as provided or is included in
the fat jar.

For example, we compile the spark distro separately for java 8 + scala 2.11 + hadoop
2.6 (with maven) and mark the dependency as provided in sbt.

-adrian
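As a concrete illustration of the provided approach (sbt syntax; the version number is an example, use whatever your cluster runs):

    // build.sbt: compile against Spark but do not bundle it into the assembly,
    // so the version installed on the cluster is the one used at runtime
    libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1" % "provided"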

From: Raghuveer Chanda
Date: Wednesday, October 21, 2015 at 2:14 PM
To: Jean-Baptiste Onofré
Cc: "user@spark.apache.org<mailto:user@spark.apache.org>"
Subject: Re: Spark on Yarn

Hi,

So does this mean I can't run spark 1.4 fat jar on yarn without installing 
spark 1.4.

I am including spark 1.4 in my pom.xml so doesn't this mean its compiling in 
1.4.


On Wed, Oct 21, 2015 at 4:38 PM, Jean-Baptiste Onofré 
<j...@nanthrax.net<mailto:j...@nanthrax.net>> wrote:
Hi

The compiled version (master side) and client version diverge on spark network 
JavaUtils. You should use the same/aligned version.

Regards
JB



Sent from my Samsung device


 Original message 
From: Raghuveer Chanda 
<raghuveer.cha...@gmail.com<mailto:raghuveer.cha...@gmail.com>>
Date: 21/10/2015 12:33 (GMT+01:00)
To: user@spark.apache.org<mailto:user@spark.apache.org>
Subject: Spark on Yarn

Hi all,

I am trying to run spark on yarn in quickstart cloudera vm.It already has spark 
1.3 and Hadoop 2.6.0-cdh5.4.0 installed.(I am not using spark-submit since I 
want to run a different version of spark).

I am able to run spark 1.3 on yarn but get the below error for spark 1.4.

The log shows its running on spark 1.4 but still gives a error on a method 
which is present in 1.4 and not 1.3. Even the fat jar contains the class files 
of 1.4.

As far as running in yarn the installed spark version shouldnt matter, but 
still its running on the other version.


Hadoop Version:
Hadoop 2.6.0-cdh5.4.0
Subversion http://github.com/cloudera/hadoop -r 
c788a14a5de9ecd968d1e2666e8765c5f018c271
Compiled by jenkins on 2015-04-21T19:18Z
Compiled with protoc 2.5.0
From source with checksum cd78f139c66c13ab5cee96e15a629025
This command was run using /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.4.0.jar

Error:
LogType:stderr
Log Upload Time:Tue Oct 20 21:58:56 -0700 2015
LogLength:2334
Log Contents:
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/filecache/10/simple-yarn-app-1.1.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
15/10/20 21:58:50 INFO spark.SparkContext: Running Spark version 1.4.0
15/10/20 21:58:53 INFO spark.SecurityManager: Changing view acls to: yarn
15/10/20 21:58:53 INFO spark.SecurityManager: Changing modify acls to: yarn
15/10/20 21:58:53 INFO spark.SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: Set(yarn); users with 
modify permissions: Set(yarn)
Exception in thread "main" java.lang.NoSuchMethodError: 
org.apache.spark.network.util.JavaUtils.timeStringAsSec(Ljava/lang/String;)J
at org.apache.spark.util.Utils$.timeStringAsSeconds(Utils.scala:1027)
at org.apache.spark.SparkConf.getTimeAsSeconds(SparkConf.scala:194)
at 
org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:68)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:54)
at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
at 
org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1991)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1982)
at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:56)
at org.apache.spark.rpc.akka.AkkaRpcEnvFactory.create(AkkaRpcEnv.scala:245)
at org.apache.spark.rpc.RpcEnv$.create(RpcEnv.scala:52)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:247)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:188)
at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:267)
at org.apache.spark.SparkContext.(SparkContext.scala:424)
at org.apache.spark.api.java.JavaSparkContext.(JavaSparkContext.scala:61)
at com.hortonworks.simpleyarnapp.HelloWorld.main(HelloWorld.java:50)
15/10/20 21:58:53 INFO util.Utils: Shutdown hook called

Please help :)

--
Regards and Thanks,
Raghuveer Chanda



--
Regards,
Raghuveer Chanda
Computer Science and Engineering
IIT Kharagpur
+91-9475470374


Re: Spark on Yarn

2015-10-21 Thread Raghuveer Chanda
Please find the attached pom.xml. I am using maven to build the fat jar and
trying to run it in yarn using

*hadoop jar simple-yarn-app-master/target/simple-yarn-app-1.1.0-shaded.jar
com.hortonworks.simpleyarnapp.Client
hdfs://quickstart.cloudera:8020/simple-yarn-app-1.1.0-shaded.jar*

Basically I am following the below code and changed the Application Master
to run a Spark application class.

https://github.com/hortonworks/simple-yarn-app

It works with 1.3 (the version installed in CDH) but throws the error with 1.4.
Since I am bundling Spark inside the jar, that shouldn't be the case, right?
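For reference, this is what marking the Spark dependency as provided looks like in a pom.xml (a sketch; artifactId and version are examples). It is the approach Adrian describes: compile against Spark but rely on the version installed on the cluster at runtime:

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.4.1</version>
      <scope>provided</scope>
    </dependency>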



On Wed, Oct 21, 2015 at 5:11 PM, Adrian Tanase <atan...@adobe.com> wrote:

> The question is the spark dependency is marked as provided or is included
> in the fat jar.
>
> For example, we are compiling the spark distro separately for java 8 +
> scala 2.11 + hadoop 2.6 (with maven) and marking it as provided in sbt.
>
> -adrian
>
> From: Raghuveer Chanda
> Date: Wednesday, October 21, 2015 at 2:14 PM
> To: Jean-Baptiste Onofré
> Cc: "user@spark.apache.org"
> Subject: Re: Spark on Yarn
>
> Hi,
>
> So does this mean I can't run spark 1.4 fat jar on yarn without installing
> spark 1.4.
>
> I am including spark 1.4 in my pom.xml so doesn't this mean its compiling
> in 1.4.
>
>
> On Wed, Oct 21, 2015 at 4:38 PM, Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
>> Hi
>>
>> The compiled version (master side) and client version diverge on spark
>> network JavaUtils. You should use the same/aligned version.
>>
>> Regards
>> JB
>>
>>
>>
>> Sent from my Samsung device
>>
>>
>>  Original message 
>> From: Raghuveer Chanda <raghuveer.cha...@gmail.com>
>> Date: 21/10/2015 12:33 (GMT+01:00)
>> To: user@spark.apache.org
>> Subject: Spark on Yarn
>>
>> Hi all,
>>
>> I am trying to run spark on yarn in quickstart cloudera vm.It already
>> has spark 1.3 and Hadoop 2.6.0-cdh5.4.0 installed.(I am not using
>> spark-submit since I want to run a different version of spark).
>>
>> I am able to run spark 1.3 on yarn but get the below error for spark 1.4.
>>
>> The log shows its running on spark 1.4 but still gives a error on a
>> method which is present in 1.4 and not 1.3. Even the fat jar contains the
>> class files of 1.4.
>>
>> As far as running in yarn the installed spark version shouldnt matter,
>> but still its running on the other version.
>>
>>
>> *Hadoop Version:*
>> Hadoop 2.6.0-cdh5.4.0
>> Subversion http://github.com/cloudera/hadoop -r
>> c788a14a5de9ecd968d1e2666e8765c5f018c271
>> Compiled by jenkins on 2015-04-21T19:18Z
>> Compiled with protoc 2.5.0
>> From source with checksum cd78f139c66c13ab5cee96e15a629025
>> This command was run using
>> /usr/lib/hadoop/hadoop-common-2.6.0-cdh5.4.0.jar
>>
>> *Error:*
>> LogType:stderr
>> Log Upload Time:Tue Oct 20 21:58:56 -0700 2015
>> LogLength:2334
>> Log Contents:
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in
>> [jar:file:/usr/lib/zookeeper/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in
>> [jar:file:/var/lib/hadoop-yarn/cache/yarn/nm-local-dir/filecache/10/simple-yarn-app-1.1.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>> explanation.
>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>> 15/10/20 21:58:50 INFO spark.SparkContext: *Running Spark version 1.4.0*
>> 15/10/20 21:58:53 INFO spark.SecurityManager: Changing view acls to: yarn
>> 15/10/20 21:58:53 INFO spark.SecurityManager: Changing modify acls to:
>> yarn
>> 15/10/20 21:58:53 INFO spark.SecurityManager: SecurityManager:
>> authentication disabled; ui acls disabled; users with view permissions:
>> Set(yarn); users with modify permissions: Set(yarn)
>> *Exception in thread "main" java.lang.NoSuchMethodError:
>> org.apache.spark.network.util.JavaUtils.timeStringAsSec(Ljava/lang/String;)J*
>> at org.apache.spark.util.Utils$.timeStringAsSeconds(Utils.scala:1027)
>> at org.apache.spark.SparkConf.getTimeAsSeconds(SparkConf.scala:194)
>> at
>> org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:68)
>> at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:54)
>> at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
>> at
>> org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$

Re: Spark on YARN using Java 1.8 fails

2015-10-12 Thread Abhisheks
Did you get any resolution for this?






Re: [Spark on YARN] Multiple Auxiliary Shuffle Service Versions

2015-10-06 Thread Steve Loughran

On 6 Oct 2015, at 01:23, Andrew Or wrote:

Both the history server and the shuffle service are backward compatible, but 
not forward compatible. This means as long as you have the latest version of 
history server / shuffle service running in your cluster then you're fine (you 
don't need multiple of them).

FWIW I've just created a JIRA on tracking/reporting version mismatch on history 
server playback better: https://issues.apache.org/jira/browse/SPARK-10950

Even though the UI can't be expected to play back later histories, it could
report the issue in a way that users can act on ("run a later version"),
rather than raising support calls.



Re: [Spark on YARN] Multiple Auxiliary Shuffle Service Versions

2015-10-06 Thread Alex Rovner
Thank you all for your help.

*Alex Rovner*
*Director, Data Engineering *
*o:* 646.759.0052


On Tue, Oct 6, 2015 at 11:17 AM, Steve Loughran 
wrote:

>
> On 6 Oct 2015, at 01:23, Andrew Or  wrote:
>
> Both the history server and the shuffle service are backward compatible,
> but not forward compatible. This means as long as you have the latest
> version of history server / shuffle service running in your cluster then
> you're fine (you don't need multiple of them).
>
>
> FWIW I've just created a JIRA on tracking/reporting version mismatch on
> history server playback better:
> https://issues.apache.org/jira/browse/SPARK-10950
>
> Even though the UI can't be expected to play back later histories, it could
> report the issue in a way that users can act on ("run a later version"),
> rather than raising support calls.
>
>


Re: [Spark on YARN] Multiple Auxiliary Shuffle Service Versions

2015-10-06 Thread Andreas Fritzler
Hi Andrew,

thanks a lot for the clarification!

Regards,
Andreas

On Tue, Oct 6, 2015 at 2:23 AM, Andrew Or  wrote:

> Hi all,
>
> Both the history server and the shuffle service are backward compatible,
> but not forward compatible. This means as long as you have the latest
> version of history server / shuffle service running in your cluster then
> you're fine (you don't need multiple of them).
>
> That said, an old shuffle service (e.g. 1.2) also happens to work with say
> Spark 1.4 because the shuffle file formats haven't changed. However, there
> are no guarantees that this will remain the case.
>
> -Andrew
>
> 2015-10-05 16:37 GMT-07:00 Alex Rovner :
>
>> We are running CDH 5.4 with Spark 1.3 as our main version and that
>> version is configured to use the external shuffling service. We have also
>> installed Spark 1.5 and have configured it not to use the external
>> shuffling service and that works well for us so far. I would be interested
>> myself how to configure multiple versions to use the same shuffling service.
>>
>> *Alex Rovner*
>> *Director, Data Engineering *
>> *o:* 646.759.0052
>>
>>
>> On Mon, Oct 5, 2015 at 11:06 AM, Andreas Fritzler <
>> andreas.fritz...@gmail.com> wrote:
>>
>>> Hi Steve, Alex,
>>>
>>> how do you handle the distribution and configuration of
>>> the spark-*-yarn-shuffle.jar on your NodeManagers if you want to use 2
>>> different Spark versions?
>>>
>>> Regards,
>>> Andreas
>>>
>>> On Mon, Oct 5, 2015 at 4:54 PM, Steve Loughran 
>>> wrote:
>>>

 > On 5 Oct 2015, at 16:48, Alex Rovner 
 wrote:
 >
 > Hey Steve,
 >
 > Are you referring to the 1.5 version of the history server?
 >


 Yes. I should warn, however, that there's no guarantee that a history
 server running the 1.4 code will handle the histories of a 1.5+ job. In
 fact, I'm fairly confident it won't, as the events to get replayed are
 different.

>>>
>>>
>>
>


Re: [Spark on YARN] Multiple Auxiliary Shuffle Service Versions

2015-10-05 Thread Andrew Or
Hi all,

Both the history server and the shuffle service are backward compatible,
but not forward compatible. This means as long as you have the latest
version of history server / shuffle service running in your cluster then
you're fine (you don't need multiple of them).

That said, an old shuffle service (e.g. 1.2) also happens to work with say
Spark 1.4 because the shuffle file formats haven't changed. However, there
are no guarantees that this will remain the case.
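
Concretely, pointing every Spark version at the same event log location and
running a single history server built from the newest release is enough; a
minimal sketch of the relevant properties (the HDFS path is hypothetical):

    # set in spark-defaults.conf for every Spark version in the cluster
    spark.eventLog.enabled           true
    spark.eventLog.dir               hdfs:///shared/spark-logs
    # read by the single (newest-version) history server
    spark.history.fs.logDirectory    hdfs:///shared/spark-logs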

-Andrew

2015-10-05 16:37 GMT-07:00 Alex Rovner :

> We are running CDH 5.4 with Spark 1.3 as our main version and that version
> is configured to use the external shuffling service. We have also installed
> Spark 1.5 and have configured it not to use the external shuffling service
> and that works well for us so far. I would be interested myself how to
> configure multiple versions to use the same shuffling service.
>
> *Alex Rovner*
> *Director, Data Engineering *
> *o:* 646.759.0052
>
>
> On Mon, Oct 5, 2015 at 11:06 AM, Andreas Fritzler <
> andreas.fritz...@gmail.com> wrote:
>
>> Hi Steve, Alex,
>>
>> how do you handle the distribution and configuration of
>> the spark-*-yarn-shuffle.jar on your NodeManagers if you want to use 2
>> different Spark versions?
>>
>> Regards,
>> Andreas
>>
>> On Mon, Oct 5, 2015 at 4:54 PM, Steve Loughran 
>> wrote:
>>
>>>
>>> > On 5 Oct 2015, at 16:48, Alex Rovner  wrote:
>>> >
>>> > Hey Steve,
>>> >
>>> > Are you referring to the 1.5 version of the history server?
>>> >
>>>
>>>
>>> Yes. I should warn, however, that there's no guarantee that a history
>>> server running the 1.4 code will handle the histories of a 1.5+ job. In
>>> fact, I'm fairly confident it won't, as the events to get replayed are
>>> different.
>>>
>>
>>
>


Re: [Spark on YARN] Multiple Auxiliary Shuffle Service Versions

2015-10-05 Thread Alex Rovner
We are running CDH 5.4 with Spark 1.3 as our main version and that version
is configured to use the external shuffling service. We have also installed
Spark 1.5 and have configured it not to use the external shuffling service
and that works well for us so far. I would be interested myself how to
configure multiple versions to use the same shuffling service.
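
For what it's worth, the switch on our side is just the standard property, set
per Spark install (a sketch; exact values are deployment-specific):

    # spark-defaults.conf of the Spark 1.3 install (uses the cluster's 1.3 aux-service)
    spark.shuffle.service.enabled    true

    # spark-defaults.conf of the Spark 1.5 install (executors serve their own shuffle files)
    spark.shuffle.service.enabled    false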

*Alex Rovner*
*Director, Data Engineering *
*o:* 646.759.0052


On Mon, Oct 5, 2015 at 11:06 AM, Andreas Fritzler <
andreas.fritz...@gmail.com> wrote:

> Hi Steve, Alex,
>
> how do you handle the distribution and configuration of
> the spark-*-yarn-shuffle.jar on your NodeManagers if you want to use 2
> different Spark versions?
>
> Regards,
> Andreas
>
> On Mon, Oct 5, 2015 at 4:54 PM, Steve Loughran 
> wrote:
>
>>
>> > On 5 Oct 2015, at 16:48, Alex Rovner  wrote:
>> >
>> > Hey Steve,
>> >
>> > Are you referring to the 1.5 version of the history server?
>> >
>>
>>
>> Yes. I should warn, however, that there's no guarantee that a history
>> server running the 1.4 code will handle the histories of a 1.5+ job. In
>> fact, I'm fairly confident it won't, as the events to get replayed are
>> different.
>>
>
>


Re: [Spark on YARN] Multiple Auxiliary Shuffle Service Versions

2015-10-05 Thread Steve Loughran

> On 5 Oct 2015, at 15:59, Alex Rovner  wrote:
> 
> I have the same question about the history server. We are trying to run 
> multiple versions of Spark and are wondering if the history server is 
> backwards compatible.

yes, it supports the pre-1.4 "Single attempt" logs as well as the 1.4+ multiple 
attempt model.





Re: [Spark on YARN] Multiple Auxiliary Shuffle Service Versions

2015-10-05 Thread Andreas Fritzler
Hi Steve, Alex,

how do you handle the distribution and configuration of
the spark-*-yarn-shuffle.jar on your NodeManagers if you want to use 2
different Spark versions?

Regards,
Andreas

On Mon, Oct 5, 2015 at 4:54 PM, Steve Loughran 
wrote:

>
> > On 5 Oct 2015, at 16:48, Alex Rovner  wrote:
> >
> > Hey Steve,
> >
> > Are you referring to the 1.5 version of the history server?
> >
>
>
> Yes. I should warn, however, that there's no guarantee that a history
> server running the 1.4 code will handle the histories of a 1.5+ job. In
> fact, I'm fairly confident it won't, as the events to get replayed are
> different.
>


Re: [Spark on YARN] Multiple Auxiliary Shuffle Service Versions

2015-10-05 Thread Alex Rovner
I have the same question about the history server. We are trying to run
multiple versions of Spark and are wondering if the history server is
backwards compatible.

*Alex Rovner*
*Director, Data Engineering *
*o:* 646.759.0052


On Mon, Oct 5, 2015 at 9:22 AM, Andreas Fritzler  wrote:

> Hi,
>
> I was just wondering if it is possible to register multiple versions of
> the aux-services with YARN as described in the documentation:
>
>    1. In the yarn-site.xml on each node, add spark_shuffle to
>    yarn.nodemanager.aux-services, then set
>    yarn.nodemanager.aux-services.spark_shuffle.class to
>    org.apache.spark.network.yarn.YarnShuffleService. Additionally, set
>    all relevant spark.shuffle.service.* configurations.
>
> The reason for the question is that I am trying to run multiple versions of
> Spark in parallel. Does anybody have experience with how such a dual-version
> setup holds up in terms of backward compatibility?
>
> Maybe sticking to the latest version of the aux-service will do the trick?
>
> Regards,
> Andreas
>
> [1]
> http://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation
>
>
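
For reference, the registration described above comes down to a yarn-site.xml
fragment along these lines (a sketch only; keeping the existing mapreduce_shuffle
entry and getting the shuffle jar onto the NodeManager classpath are
deployment-specific):

    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle,spark_shuffle</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
      <value>org.apache.spark.network.yarn.YarnShuffleService</value>
    </property>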


Re: [Spark on YARN] Multiple Auxiliary Shuffle Service Versions

2015-10-05 Thread Alex Rovner
Hey Steve,

Are you referring to the 1.5 version of the history server?

*Alex Rovner*
*Director, Data Engineering *
*o:* 646.759.0052


On Mon, Oct 5, 2015 at 10:18 AM, Steve Loughran 
wrote:

>
> > On 5 Oct 2015, at 15:59, Alex Rovner  wrote:
> >
> > I have the same question about the history server. We are trying to run
> multiple versions of Spark and are wondering if the history server is
> backwards compatible.
>
> yes, it supports the pre-1.4 "Single attempt" logs as well as the 1.4+
> multiple attempt model.
>
>

