Re: Invalid ContainerId ... Caused by: java.lang.NumberFormatException: For input string: e04

2015-03-24 Thread Steve Loughran

 On 24 Mar 2015, at 02:10, Marcelo Vanzin van...@cloudera.com wrote:
 
 This happens most probably because the Spark 1.3 you have downloaded
 is built against an older version of the Hadoop libraries than those
 used by CDH, and those libraries cannot parse the container IDs
 generated by CDH.


This sounds suspiciously like the change in YARN for HA (the epoch number)
isn't being parsed by older versions of the YARN client libs. This is
effectively a regression in the YARN code - it's creating container IDs that
can't be easily parsed by old apps. It may be possible to fix that Spark-side
by having Spark use its own parser for the YARN container/app environment
variable.
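
As a rough illustration, here is a minimal sketch of such a lenient parser (hypothetical code of my own, not Spark's or YARN's actual implementation) that tolerates the optional epoch component:

  // Hypothetical sketch only -- not Spark's or YARN's actual code.
  // New format: container_e<epoch>_<clusterTimestamp>_<appId>_<attemptId>_<seq>
  // Old format: container_<clusterTimestamp>_<appId>_<attemptId>_<seq>
  public class LenientContainerIdParser {

      static long parseContainerSeq(String id) {
          String[] parts = id.split("_");
          int i = 1;                          // parts[0] is the "container" prefix
          if (parts[i].startsWith("e")) {
              i++;                            // skip the epoch; old parsers die here
          }
          // parts[i] = clusterTimestamp, parts[i+1] = appId,
          // parts[i+2] = attemptId, parts[i+3] = container sequence number
          return Long.parseLong(parts[i + 3]);
      }

      public static void main(String[] args) {
          // both the new (epoch) and the old formats parse
          System.out.println(parseContainerSeq("container_e04_1427159778706_0002_01_01"));
          System.out.println(parseContainerSeq("container_1427159778706_0002_01_01"));
      }
  }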




Re: Invalid ContainerId ... Caused by: java.lang.NumberFormatException: For input string: e04

2015-03-24 Thread Manoj Samel
Thanks Marcelo - I was using the SBT-built Spark per the earlier thread. I
have now switched to the distro (with the conf changes putting the CDH path
in front) and the Guava issue is gone.

Thanks,

On Tue, Mar 24, 2015 at 1:50 PM, Marcelo Vanzin van...@cloudera.com wrote:

 Hi there,

 On Tue, Mar 24, 2015 at 1:40 PM, Manoj Samel manojsamelt...@gmail.com
 wrote:
  When I run any query, it gives java.lang.NoSuchMethodError:
 
 com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;

 Are you running a custom-compiled Spark by any chance? Specifically,
 one you built with sbt? That would hit this problem, because the path
 I suggested (/usr/lib/hadoop/client/*) contains an older guava
 library, which would override the one shipped with the sbt-built
 Spark.

 If you build Spark with maven, or use the pre-built Spark distro, or
 specifically filter out the guava jar from your classpath when setting
 up the Spark job, things should work.

 --
 Marcelo




Re: Invalid ContainerId ... Caused by: java.lang.NumberFormatException: For input string: e04

2015-03-24 Thread Sandy Ryza
Steve, that's correct, but the problem only shows up when different
versions of the YARN jars are included on the classpath.

-Sandy

On Tue, Mar 24, 2015 at 6:29 AM, Steve Loughran ste...@hortonworks.com
wrote:


  On 24 Mar 2015, at 02:10, Marcelo Vanzin van...@cloudera.com wrote:
 
  This happens most probably because the Spark 1.3 you have downloaded
  is built against an older version of the Hadoop libraries than those
  used by CDH, and those libraries cannot parse the container IDs
  generated by CDH.


 This sounds suspiciously like the change in YARN for HA (the epoch
 number) isn't being parsed by older versions of the YARN client libs. This
 is effectively a regression in the YARN code - it's creating container IDs
 that can't be easily parsed by old apps. It may be possible to fix that
 Spark-side by having Spark use its own parser for the YARN container/app
 environment variable.





Re: Invalid ContainerId ... Caused by: java.lang.NumberFormatException: For input string: e04

2015-03-24 Thread Manoj Samel
Thanks all - perhaps I misread the earlier posts as being only about the
Hadoop version dependency; the key is also matching CDH 5.3.2 itself (not
just Hadoop 2.5 vs. 2.4).

After adding the classpath as Marcelo/Harsh suggested (loading the CDH libs
at the front), I am able to get spark-shell started without the
invalid-container error, so that issue is solved.

When I run any query, it gives java.lang.NoSuchMethodError:
com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;

This seems to be the known Guava lib version issue ... I will
look into it.
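
One quick way to check which Guava is winning on the classpath (a diagnostic sketch of my own, not something from this thread; the usual cause is Hadoop's bundled Guava 11, which predates hashInt(int)):

  import com.google.common.hash.HashFunction;

  public class GuavaProbe {
      public static void main(String[] args) {
          // prints the jar that HashFunction was actually loaded from
          System.out.println(HashFunction.class
                  .getProtectionDomain().getCodeSource().getLocation());
      }
  }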

Thanks again!

On Tue, Mar 24, 2015 at 12:50 PM, Harsh J ha...@cloudera.com wrote:

 My comment's still the same: link Spark at runtime, via the classpath,
 against the CDH 5.3.2 libraries, just like your cluster does - not against
 Apache Hadoop 2.5.0 (which CDH is merely based on; CDH carries several
 backports on top that aren't in Apache Hadoop 2.5.0, one of which addresses
 this parsing trouble).

 You do not need to recompile Spark; just alter the Hadoop libraries on its
 classpath to be those of the CDH server version (overwrite from parcels,
 etc.).

 On Wed, Mar 25, 2015 at 1:06 AM, Manoj Samel manojsamelt...@gmail.com
 wrote:

 I recompiled Spark 1.3 with Hadoop 2.5; it still gives the same stack trace.

 A quick browse into the stack trace with Hadoop 2.5.0's
 org.apache.hadoop.yarn.util.ConverterUtils ...

 1. toContainerId gets the parameter containerId, which I assume is
 container_e06_1427223073530_0001_01_01
 2. It splits it using public static final Splitter _SPLITTER =
 Splitter.on('_').trimResults();
 3. Line 172 checks the container prefix against CONTAINER_PREFIX, which is
 valid (container)
 4. It calls toApplicationAttemptId
 5. toApplicationAttemptId tries Long.parseLong(it.next()) on e06 and dies

 Seems like it is not expecting a non-numeric component in that position. Is
 this a YARN issue?
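
 To make the failure concrete, here is a minimal standalone repro of steps
 2-5 above (my sketch mirroring those steps, not the actual Hadoop source):

  import java.util.Iterator;
  import com.google.common.base.Splitter;

  public class ParseRepro {
      private static final Splitter SPLITTER = Splitter.on('_').trimResults();

      public static void main(String[] args) {
          Iterator<String> it =
                  SPLITTER.split("container_e06_1427223073530_0001_01_01").iterator();
          it.next();                 // "container" -- the prefix check passes
          // toApplicationAttemptId then calls Long.parseLong on the next token,
          // which is the epoch "e06" rather than the cluster timestamp:
          Long.parseLong(it.next()); // NumberFormatException: For input string: "e06"
      }
  }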

 Thanks,

 On Tue, Mar 24, 2015 at 8:25 AM, Manoj Samel manoj.sa...@gmail.com
 wrote:

 I'll compile Spark with Hadoop libraries and try again ...

 Thanks,

 Manoj

 On Mar 23, 2015, at 10:34 PM, Harsh J ha...@cloudera.com wrote:

 This may happen if you are using different versions of the CDH5 jars between
 Spark and the cluster. Can you ensure your Spark's CDH Hadoop jars match
 the cluster version exactly, since you seem to be using a custom build of
 Spark (from outside CDH) here?

 On Tue, Mar 24, 2015 at 7:32 AM, Manoj Samel manojsamelt...@gmail.com
 wrote:

 x-post to CDH list for any insight ...

 Thanks,

 -- Forwarded message --
 From: Manoj Samel manojsamelt...@gmail.com
 Date: Mon, Mar 23, 2015 at 6:32 PM
 Subject: Invalid ContainerId ... Caused by:
 java.lang.NumberFormatException: For input string: e04
 To: user@spark.apache.org user@spark.apache.org


 Spark 1.3, CDH 5.3.2, Kerberos

 Setup works fine with the base configuration; spark-shell can be used in
 yarn-client mode, etc.

 When the work-preserving recovery feature is enabled via
 http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/admin_ha_yarn_work_preserving_recovery.html,
 spark-shell fails with the following log:

 15/03/24 01:20:16 ERROR yarn.ApplicationMaster: Uncaught exception:
 java.lang.IllegalArgumentException: Invalid ContainerId: container_e04_1427159778706_0002_01_01
     at org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:182)
     at org.apache.spark.deploy.yarn.YarnRMClient.getAttemptId(YarnRMClient.scala:93)
     at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:83)
     at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:576)
     at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:60)
     at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:59)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:422)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
     at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:59)
     at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:574)
     at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:597)
     at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
 Caused by: java.lang.NumberFormatException: For input string: e04
     at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
     at java.lang.Long.parseLong(Long.java:589)
     at java.lang.Long.parseLong(Long.java:631)
     at org.apache.hadoop.yarn.util.ConverterUtils.toApplicationAttemptId(ConverterUtils.java:137)
     at org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:177)
     ... 12 more
 15/03/24 01:20:16 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: Invalid ContainerId: container_e04_1427159778706_0002_01_01)

Re: Invalid ContainerId ... Caused by: java.lang.NumberFormatException: For input string: e04

2015-03-24 Thread Marcelo Vanzin
Hi there,

On Tue, Mar 24, 2015 at 1:40 PM, Manoj Samel manojsamelt...@gmail.com wrote:
 When I run any query, it gives java.lang.NoSuchMethodError:
 com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;

Are you running a custom-compiled Spark by any chance? Specifically,
one you built with sbt? That would hit this problem, because the path
I suggested (/usr/lib/hadoop/client/*) contains an older guava
library, which would override the one shipped with the sbt-built
Spark.

If you build Spark with maven, or use the pre-built Spark distro, or
specifically filter out the guava jar from your classpath when setting
up the Spark job, things should work.

-- 
Marcelo




Re: Invalid ContainerId ... Caused by: java.lang.NumberFormatException: For input string: e04

2015-03-23 Thread Marcelo Vanzin
This happens most probably because the Spark 1.3 you have downloaded
is built against an older version of the Hadoop libraries than those
used by CDH, and those libraries cannot parse the container IDs
generated by CDH.

You can try to work around this by manually adding CDH jars to the
front of the classpath by setting spark.driver.extraClassPath and
spark.executor.extraClassPath to /usr/lib/hadoop/client/* (or the
respective location if you're using parcels).
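
For example, in spark-defaults.conf (the parcel path below is my guess at the
usual layout; adjust it to your install):

  spark.driver.extraClassPath   /usr/lib/hadoop/client/*
  spark.executor.extraClassPath /usr/lib/hadoop/client/*
  # with parcels, something like:
  # spark.driver.extraClassPath   /opt/cloudera/parcels/CDH/lib/hadoop/client/*
  # spark.executor.extraClassPath /opt/cloudera/parcels/CDH/lib/hadoop/client/*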


On Mon, Mar 23, 2015 at 6:32 PM, Manoj Samel manojsamelt...@gmail.com wrote:
 Spark 1.3, CDH 5.3.2, Kerberos

 Setup works fine with the base configuration; spark-shell can be used in
 yarn-client mode, etc.

 When the work-preserving recovery feature is enabled via
 http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/admin_ha_yarn_work_preserving_recovery.html,
 spark-shell fails with the following log:

 15/03/24 01:20:16 ERROR yarn.ApplicationMaster: Uncaught exception:
 java.lang.IllegalArgumentException: Invalid ContainerId: container_e04_1427159778706_0002_01_01
     at org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:182)
     at org.apache.spark.deploy.yarn.YarnRMClient.getAttemptId(YarnRMClient.scala:93)
     at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:83)
     at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:576)
     at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:60)
     at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:59)
     at java.security.AccessController.doPrivileged(Native Method)
     at javax.security.auth.Subject.doAs(Subject.java:422)
     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
     at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:59)
     at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:574)
     at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:597)
     at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
 Caused by: java.lang.NumberFormatException: For input string: e04
     at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
     at java.lang.Long.parseLong(Long.java:589)
     at java.lang.Long.parseLong(Long.java:631)
     at org.apache.hadoop.yarn.util.ConverterUtils.toApplicationAttemptId(ConverterUtils.java:137)
     at org.apache.hadoop.yarn.util.ConverterUtils.toContainerId(ConverterUtils.java:177)
     ... 12 more
 15/03/24 01:20:16 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: Invalid ContainerId: container_e04_1427159778706_0002_01_01)





-- 
Marcelo
