Hey, thanks again. That's what I feared, but I had hoped they would ensure backward compatibility in this new build of CDH. I'll try the standalone cluster manager instead. Thanks again, and merry Christmas!
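(Running against the standalone cluster manager sidesteps the YARN API entirely. A minimal sketch of a driver wired to a standalone master, using the 0.8-era SparkContext constructor; the master URL, Spark home, and jar path are placeholders, not values from this thread:

    import org.apache.spark.SparkContext

    object StandaloneSmokeTest {
      def main(args: Array[String]) {
        // Spark 0.8-era constructor: master URL, app name, Spark home,
        // and jars to ship to the workers. All four values below are
        // assumptions for illustration.
        val sc = new SparkContext(
          "spark://master-host:7077",
          "StandaloneSmokeTest",
          "/opt/spark",
          Seq("/path/to/app.jar"))
        // Trivial job to confirm the cluster accepts work.
        println(sc.parallelize(1 to 1000).count())
        sc.stop()
      }
    }

The same assembly jar built with SPARK_YARN=false works here, since the standalone master never touches the YARN client libraries.)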
On 2013/12/23, Patrick Wendell <[email protected]> wrote:

Hey Kevin,

I looked some more. It turns out CDH 4.4 and 4.5 include changes to the YARN API that are not compatible with Spark's YARN implementation, specifically [1].

We've gone through some effort to make Spark-on-YARN work well with both the YARN 2.2 stable APIs (which will be in CDH5) and one popular version of the "alpha" YARN APIs (the one in CDH 4.1-4.3, also used inside of Yahoo).

However, right now Spark doesn't support this particular version, as it is a one-off build found only in CDH 4.4/4.5. So the solution here is to either roll back to an earlier CDH, patch Spark to work with CDH 4.4's version of the YARN API, or just deploy Spark using the standalone cluster manager instead of YARN (AFAIK YARN is still considered experimental in CDH 4.x anyway).

[1] https://issues.apache.org/jira/browse/YARN-45

On Mon, Dec 23, 2013 at 9:27 AM, Kevin Moulart <[email protected]> wrote:

Hi, thanks for the answer. I'm using this command to compile:

    SPARK_HADOOP_VERSION=2.0.0-cdh4.5.0 SPARK_YARN=true ./sbt/sbt assembly

When I do that, it runs for about 3-5 minutes and then, after a long "packaging" phase, it simply says it failed.

On 2013/12/23, Patrick Wendell <[email protected]> wrote:

Hey Kevin,

Could you give us the exact command that you are using to compile? It's possible the YARN API changed in CDH 4.5 and our heuristics don't detect it correctly.

On Mon, Dec 23, 2013 at 6:43 AM, Kevin Moulart <[email protected]> wrote:

I just tried to compile version 0.8.1 against CDH 4.5.0 and it failed just the same.

On Tuesday, December 17, 2013 at 20:56:24 UTC+1, Debasish Das wrote:

Thanks Matei.

We will wait for the release candidate of Spark 0.8.1 and see whether it can be run against the latest CDH/HDP YARN.

On Monday, December 16, 2013 10:02:41 PM UTC-8, Matei Zaharia wrote:

Ah, this is because of a YARN API update in CDH 4.5.0 (as well as Apache Hadoop 2.2). You'll need to wait for Spark 0.8.1 to compile against that. There is a release candidate posted on our Apache mailing list: http://spark.incubator.apache.org/mailing-lists.html.

Matei

On Dec 16, 2013, at 4:51 PM, Debasish Das <[email protected]> wrote:

Hi Patrick,

With the following configs:

    export SPARK_HADOOP_VERSION=2.0.0-cdh4.5.0
    export SPARK_YARN=true

the errors inside the yarn project are as follows:

    [warn] /home/debasish/sag_spark/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:59: Treating numbers with a leading zero as octal is deprecated.
    [warn]   val STAGING_DIR_PERMISSION: FsPermission = FsPermission.createImmutable(0700: Short)
    [warn] /home/debasish/sag_spark/yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala:62: Treating numbers with a leading zero as octal is deprecated.
    [warn]   val APP_FILE_PERMISSION: FsPermission = FsPermission.createImmutable(0644: Short)
    [error] /home/debasish/sag_spark/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:36: object AMResponse is not a member of package org.apache.hadoop.yarn.api.records
    [error] import org.apache.hadoop.yarn.api.records.{AMResponse, ApplicationAttemptId}
    [error] /home/debasish/sag_spark/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala:105: value getAMResponse is not a member of org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse
    [error]     val amResp = allocateWorkerResources(workersToRequest).getAMResponse
    [warn] two warnings found
    [error] two errors found
    [error] (yarn/compile:compile) Compilation failed
    [error] Total time: 15 s, completed Dec 16, 2013 7:47:03 PM

Note that I can run the code against the cdh4.5.0 mr1 client, but we need the YARN jar for deployment.

Thanks.
Deb
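(The two [error] lines above are exactly the YARN-45 change Patrick links to: the alpha API's AMResponse record was folded into AllocateResponse, so allocated containers are now read off the allocate response directly. The octal warnings are independent and harmless, but easy to silence at the same time. A small sketch of what code on the new API looks like, using simplified stand-in names rather than Spark's actual patch, and assuming Hadoop 2.2-line hadoop-common and hadoop-yarn-api jars on the classpath:

    import org.apache.hadoop.fs.permission.FsPermission
    import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse

    object Yarn45Sketch {
      // Silences the deprecation warning: build the 0700 permission via a
      // radix-8 parse instead of a leading-zero octal literal.
      val STAGING_DIR_PERMISSION: FsPermission =
        FsPermission.createImmutable(Integer.parseInt("700", 8).toShort)

      // Post-YARN-45 there is no AMResponse: AllocateResponse itself
      // carries the allocated containers. Pre-YARN-45 code reached them
      // through response.getAMResponse.getAllocatedContainers instead.
      def allocatedCount(response: AllocateResponse): Int =
        response.getAllocatedContainers().size()
    }

Patching every such call site in yarn/src is what "patch Spark to work with CDH 4.4's version of the YARN API" amounts to; rolling back to CDH 4.1-4.3 or deploying standalone avoids carrying the patch.)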
On Friday, December 13, 2013 11:03:32 AM UTC-8, Patrick Wendell wrote:

What errors are you getting in this case? Are they the same errors as before, or something else?

On Thu, Dec 12, 2013 at 11:54 PM, Debasish Das <[email protected]> wrote:

Thanks TD, sbt clean helped.

With these configs I could get the jar file, and it runs fine on the standalone Spark cluster:

    export SPARK_HADOOP_VERSION=2.0.0-mr1-cdh4.5.0
    export SPARK_YARN=false

If I try to generate the deployment jar for YARN with the following configs, I get errors:

    export SPARK_HADOOP_VERSION=2.0.0-cdh4.5.0
    export SPARK_YARN=true

Thanks.
Deb

On Thursday, December 12, 2013 10:39:05 PM UTC-8, TD wrote:

Can you try doing an "sbt clean" before building? I have seen this error once, and a clean build helped.

On Thu, Dec 12, 2013 at 10:37 PM, Debasish Das <[email protected]> wrote:

Hi,

I could compile Spark with CDH 4.2.0, but when I tried to access HDFS it failed.

I found an old post on the Spark user group saying that Spark should be compiled against the exact Hadoop client version of the cluster. Our cluster is on CDH 4.5.0, so I set the following configs for compiling the master branch:

    export SPARK_HADOOP_VERSION=2.0.0-cdh4.5.0
    export SPARK_YARN=true

I also tried to see if I can build against the MR1 client only:

    export SPARK_HADOOP_VERSION=2.0.0-mr1-cdh4.5.0
    export SPARK_YARN=false

I am getting 43 compilation errors from the spark-streaming project. A few of the messages are attached below.
    [error] /home/debasish/sag_spark/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala:51: type mismatch;
    [error]  found   : org.apache.spark.streaming.DStream[(K, V)]
    [error]  required: org.apache.spark.streaming.api.java.JavaPairDStream[K,V]
    [error] Note: implicit method fromPairDStream is not applicable here because it comes after the application point and it lacks an explicit result type
    [error]     dstream.filter((x => f(x).booleanValue()))
    [error] /home/debasish/sag_spark/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala:54: type mismatch;
    [error]  found   : org.apache.spark.streaming.DStream[(K, V)]
    [error]  required: org.apache.spark.streaming.api.java.JavaPairDStream[K,V]
    [error] Note: implicit method fromPairDStream is not applicable here because it comes after the application point and it lacks an explicit result type
    [error]     def cache(): JavaPairDStream[K, V] = dstream.cache()
    [error] /home/debasish/sag_spark/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala:57: type mismatch;
    [error]  found   : org.apache.spark.streaming.DStream[(K, V)]
    [error]  required: org.apache.spark.streaming.api.java.JavaPairDStream[K,V]
    [error] Note: implicit method fromPairDStream is not applicable here because it comes after the application point and it lacks an explicit result type
    [error]     def persist(): JavaPairDStream[K, V] = dstream.persist()
    [error] /home/debasish/sag_spark/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala:60: type mismatch;
    [error]  found   : org.apache.spark.streaming.DStream[(K, V)]
    [error]  required: org.apache.spark.streaming.api.java.JavaPairDStream[K,V]
    [error] Note: implicit method fromPairDStream is not applicable here because it comes after the application point and it lacks an explicit result type
    [error]     def persist(storageLevel: StorageLevel): JavaPairDStream[K, V] = dstream.persist(storageLevel)
    [error] /home/debasish/sag_spark/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala:66: type mismatch;
    [error]  found   : org.apache.spark.streaming.DStream[(K, V)]
    [error]  required: org.apache.spark.streaming.api.java.JavaPairDStream[K,V]
    [error] Note: implicit method fromPairDStream is not applicable here because it comes after the application point and it lacks an explicit result type
    [error]     def repartition(numPartitions: Int): JavaPairDStream[K, V] = dstream.repartition(numPartitions)
    [error] /home/debasish/sag_spark/streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaPairDStream.scala:83: type mismatch;
    [error]  found   : org.apache.spark.streaming.DStream[(K, V)]
    [error]  required: org.apache.spark.streaming.api.java.JavaPairDStream[K,V]
    [error] Note: implicit method fromPairDStream is not applicable here because it comes after the application point and it lacks an explicit result type
    [error]     dstream.window(windowDuration)

Note that the project compiled fine with CDH 4.2.0, but I could not access our HDFS data.

Thanks.
Deb
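(The repeated compiler note refers to a general Scala rule: an implicit conversion used textually before its definition in the same file is only applicable if it declares an explicit result type. Since a plain sbt clean made these errors disappear, per TD's suggestion above, stale incremental-compile state rather than the streaming code itself was the likely culprit; still, the rule is easy to see in isolation. A self-contained illustration with made-up names, not Spark code:

    object ImplicitOrdering {
      class Wrapper(val n: Int)

      def takesWrapper(w: Wrapper): Int = w.n

      // This call needs the Int => Wrapper view defined *below* it. That
      // works only because wrap declares an explicit result type; writing
      // "implicit def wrap(n: Int) = new Wrapper(n)" instead would produce
      // exactly the "comes after the application point" note above.
      def demo: Int = takesWrapper(3)

      implicit def wrap(n: Int): Wrapper = new Wrapper(n)
    }

)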
--
Kévin Moulart
Mobile (France): +33 7 81 06 10 10
Mobile (Belgium): +32 473 85 23 85
Landline: +32 2 771 88 45
