Re: Support for Hadoop 2.2

Claudio Romo Otto Tue, 26 Nov 2013 12:31:03 -0800

Hi Juan,

In a nutshell, you must pay attention to the memory settings inside mapred-site.xml, yarn-site.xml, hadoop-env.sh and yarn-env.sh, so you have to design a memory distribution strategy according to your performance requirements. That way you will have, among other things, enough memory for the Scheduler.

Remember to reserve at least 600 - 800 MB for the operating system to avoid OOM errors.
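
As a rough illustration of the kind of memory budget described above, the Python sketch below splits one worker node's RAM between the operating system and YARN containers. The property names are standard Hadoop 2 keys, but every number (the node size, the 800 MB reserve, the 1024 MB container size, the 0.8 heap ratio) and the helper name plan_memory are assumptions made for illustration, not the settings actually used in this thread.

```python
def plan_memory(total_mb, os_reserve_mb=800, container_mb=1024, heap_ratio=0.8):
    """Return a dict of Hadoop 2 memory settings for one worker node.

    All defaults are illustrative assumptions, not recommended values.
    """
    usable_mb = total_mb - os_reserve_mb
    containers = usable_mb // container_mb
    # Assumption: a MapReduce job needs one container for its
    # ApplicationMaster plus at least one more for a map or reduce task.
    if containers < 2:
        raise ValueError("not enough memory for an ApplicationMaster and a task")
    # The JVM heap is kept below the container size to leave room
    # for non-heap memory inside the container.
    heap_mb = int(container_mb * heap_ratio)
    return {
        # yarn-site.xml
        "yarn.nodemanager.resource.memory-mb": containers * container_mb,
        "yarn.scheduler.minimum-allocation-mb": container_mb,
        # mapred-site.xml
        "yarn.app.mapreduce.am.resource.mb": container_mb,
        "mapreduce.map.memory.mb": container_mb,
        "mapreduce.reduce.memory.mb": container_mb,
        "mapreduce.map.java.opts": "-Xmx%dm" % heap_mb,
        "mapreduce.reduce.java.opts": "-Xmx%dm" % heap_mb,
    }


if __name__ == "__main__":
    # Hypothetical 4 GB node.
    for key, value in sorted(plan_memory(4096).items()):
        print("%s = %s" % (key, value))
```

If the node cannot fit both an ApplicationMaster container and a task container, tasks may wait indefinitely for resources, which is one possible reason for a task that never leaves the SCHEDULED state.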


Best regards
On 26/11/13 16:07, Juan Martin Pampliega wrote:

Hi Claudio,

It would be nice to know which settings you had to tune to get this
working. I am having a similar issue with some jobs that I am running.
Thanks,
Juan.


On Wed, Oct 30, 2013 at 7:40 PM, Claudio Romo Otto <
[email protected]> wrote:

Jarcec, I finally solved this problem by learning more about Hadoop 2
(a lot of reading) and then tuning some settings to let the work move on
from the SCHEDULED state. That said, the last problem concerned only
Hadoop.

Thanks for your support!

On 30/10/13 18:03, Jarek Jarcec Cecho wrote:

Hi Claudio,

It's hard to guess from the limited information. I would suggest taking
a look at the logs to see what is happening.

One guess, though: you've mentioned that the task was "running" for 30
minutes, but it still seems to be in the SCHEDULED state. Are your node
managers running correctly?
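
One way to check this is to ask the ResourceManager which NodeManagers it currently sees. The Python sketch below assumes the ResourceManager REST endpoint /ws/v1/cluster/nodes and its usual JSON layout (a "nodes" object holding a "node" list with "id" and "state" fields); the host names and the SAMPLE payload are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_nodes(rm_url):
    """Fetch the node list from a ResourceManager, e.g. http://master:8088."""
    url = rm_url.rstrip("/") + "/ws/v1/cluster/nodes"
    with urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def running_nodes(payload):
    """Return the ids of all nodes reported in the RUNNING state."""
    # "nodes" may be missing or null when no NodeManager has registered.
    nodes = (payload.get("nodes") or {}).get("node") or []
    return [n.get("id") for n in nodes if n.get("state") == "RUNNING"]


# Hypothetical response, shaped like the assumed JSON layout.
SAMPLE = {
    "nodes": {
        "node": [
            {"id": "slave1:45454", "state": "RUNNING"},
            {"id": "slave2:45454", "state": "LOST"},
        ]
    }
}

if __name__ == "__main__":
    # Swap SAMPLE for fetch_nodes("http://master:8088") on a real cluster.
    print(running_nodes(SAMPLE))
```

An empty result would suggest that no NodeManager is registered, in which case tasks have nowhere to run.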

Jarcec

On Fri, Oct 25, 2013 at 04:10:12PM -0300, Claudio Romo Otto wrote:

You got it!

The solution was to compile with the -Dhadoopversion=23 option. After
your message I ran another test, removing Cassandra from the chain,
and Pig successfully sent the job to Hadoop.

BUT the problem has changed: now the Map task remains stuck forever on
Hadoop (30 minutes waiting, no other jobs running):

Task          task_1382631533263_0012_m_000000
              <http://topgps-test-3.dnsalias.com:8088/proxy/application_1382631533263_0012/mapreduce/task/task_1382631533263_0012_m_000000>
Progress
State         SCHEDULED
Start Time    Fri, 25 Oct 2013 18:18:32 GMT
Finish Time   N/A
Elapsed Time  0sec

Attempt       attempt_1382631533263_0012_m_000000_0
Progress      0,00
State         STARTING
Node          N/A
Logs          N/A
Started       N/A
Finished      N/A
Elapsed       0sec
Note


I don't know whether this is a Hadoop problem or a Pig problem. What do you think?


On 25/10/13 13:11, Jarek Jarcec Cecho wrote:

It seems that Pig was correctly compiled against Hadoop 23, but the
Cassandra piece was not. Check where the exception is coming from:

  Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
      at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:113)

So, I would say that you also need to get a Hadoop 2 compatible Cassandra
connector first.
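
The same check can be automated. The Python sketch below is only an illustration (the helper name root_cause_frame is made up): it reads a Java stack trace, keeps the last "Caused by:" section, and reports the class of the first frame under it, which points at the component that was built against the wrong Hadoop API.

```python
import re

# Matches "at some.package.Class.method(" and captures the class and method.
FRAME = re.compile(r"^\s*at\s+([\w.$]+)\.([\w$<>]+)\(")


def root_cause_frame(trace):
    """Return (exception text, class of the first frame) for the last cause."""
    cause, frame_class = None, None
    for line in trace.splitlines():
        stripped = line.strip()
        if stripped.startswith("Caused by:"):
            # A new cause starts; forget the frame of the previous one.
            cause = stripped[len("Caused by:"):].strip()
            frame_class = None
        elif cause is not None and frame_class is None:
            match = FRAME.match(line)
            if match:
                frame_class = match.group(1)
    return cause, frame_class


SAMPLE = """\
Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
      at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:113)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
"""

if __name__ == "__main__":
    cause, frame_class = root_cause_frame(SAMPLE)
    print(cause)
    print(frame_class)
```

The sketch expects each frame on a single line, so traces that were wrapped by a mail client need to be rejoined first.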

Jarcec

On Thu, Oct 24, 2013 at 10:34:49PM -0300, Claudio Romo Otto wrote:

After changing from hadoop20 to hadoop23 the warning disappeared, but I
got the same exception (Found interface
org.apache.hadoop.mapreduce.JobContext, but class was expected).

I have tried on a fresh install: hadoop 2.2.0 and pig 0.12.1
compiled by me, no other products or configuration, just two
servers: one master with ResourceManager and NameNode, and one slave
with DataNode and NodeManager.

I can't understand why Pig 0.12 fails on this fresh cluster. Here
is the new trace:

2013-10-24 16:10:52,351 [JobControl] ERROR org.apache.pig.backend.hadoop23.PigJobControl - Error while trying to run jobs.
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
      at org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:130)
      at org.apache.pig.backend.hadoop23.PigJobControl.run(PigJobControl.java:191)
      at java.lang.Thread.run(Thread.java:724)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257)
Caused by: java.lang.reflect.InvocationTargetException
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at org.apache.pig.backend.hadoop23.PigJobControl.submit(PigJobControl.java:128)
      ... 3 more
Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
      at org.apache.cassandra.hadoop.AbstractColumnFamilyInputFormat.getSplits(AbstractColumnFamilyInputFormat.java:113)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:274)
      at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:491)
      at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:508)
      at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:392)
      at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
      at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
      at java.security.AccessController.doPrivileged(Native Method)
      at javax.security.auth.Subject.doAs(Subject.java:415)
      at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
      at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
      at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
      ... 8 more


On 24/10/13 21:33, Prashant Kommireddi wrote:

Yes it does. You need to recompile Pig for hadoop 2

ant clean jar-withouthadoop -Dhadoopversion=23


On Thu, Oct 24, 2013 at 5:37 AM, Claudio Romo Otto <
[email protected]> wrote:

Does Pig support Hadoop 2.2? When I try Pig 0.12 and Hadoop 2.2 I get an
error even with simple operations like

data = LOAD 'cql://keyspace1/testcf?' USING CqlStorage();
dump data;

I only got a warning first and then an exception:

2013-10-24 09:35:19,300 [main] WARN org.apache.pig.backend.hadoop20.PigJobControl - falling back to default JobControl (not using hadoop 0.20 ?)
java.lang.NoSuchFieldException: runnerState
      at java.lang.Class.getDeclaredField(Class.java:1938)
      at org.apache.pig.backend.hadoop20.PigJobControl.<clinit>(PigJobControl.java:51)
      at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.newJobControl(HadoopShims.java:97)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler.compile(JobControlCompiler.java:285)
      at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:177)
      at org.apache.pig.PigServer.launchPlan(PigServer.java:1264)
      at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1249)
      at org.apache.pig.PigServer.storeEx(PigServer.java:931)
      at org.apache.pig.PigServer.store(PigServer.java:898)
      at org.apache.pig.PigServer.openIterator(PigServer.java:811)
      at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
      at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
      at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
      at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
      at org.apache.pig.Main.run(Main.java:538)
      at org.apache.pig.Main.main(Main.java:157)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

--------------------------------------

Backend error message during job submission
-------------------------------------------
Unexpected System Error Occured: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
          at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.setupUdfEnvAndStores(PigOutputFormat.java:225)
          at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat.checkOutputSpecs(PigOutputFormat.java:186)
          at org.apache.hadoop.mapreduce.JobSubmitter.checkSpecs(JobSubmitter.java:456)
          at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:342)
          at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268)
          at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265)
          at java.security.AccessController.doPrivileged(Native Method)
          at javax.security.auth.Subject.doAs(Subject.java:415)
          at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
          at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265)
          at org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob.submit(ControlledJob.java:335)
          at org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl.run(JobControl.java:240)
          at org.apache.pig.backend.hadoop20.PigJobControl.run(PigJobControl.java:121)
          at java.lang.Thread.run(Thread.java:724)
          at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$1.run(MapReduceLauncher.java:257)

Pig Stack Trace
---------------
ERROR 1066: Unable to open iterator for alias data

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias data
          at org.apache.pig.PigServer.openIterator(PigServer.java:836)
          at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:696)
          at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:320)
          at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:194)
          at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:170)
          at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:69)
          at org.apache.pig.Main.run(Main.java:538)
          at org.apache.pig.Main.main(Main.java:157)
          at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
          at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
          at java.lang.reflect.Method.invoke(Method.java:606)
          at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.io.IOException: Job terminated with anomalous status FAILED
          at org.apache.pig.PigServer.openIterator(PigServer.java:828)
          ... 12 more
