Usually Hadoop is used within a distro. Those can be Cloudera, Hortonworks, EMR, etc.

On Jan 11, 2014 2:05 AM, "Mariano Kamp" <mariano.k...@gmail.com> wrote:
> Hi Josh.
>
> Ok, got it. Interesting.
>
> Downloaded ant, recompiled and now it works.
>
> Thank you.
>
>
> On Fri, Jan 10, 2014 at 10:16 PM, Josh Elser <josh.el...@gmail.com> wrote:
>
> > Mariano,
> >
> > Pig 0.12.0 does work with Hadoop 2.2.0, but you need to recompile Pig
> > first.
> >
> > In your $PIG_HOME, run the following command to rebuild Pig:
> >
> > `ant clean jar-withouthadoop -Dhadoopversion=23`
> >
> > Then, try re-running your script.
> >
> >
> > On 1/9/14, 5:13 PM, Mariano Kamp wrote:
> >
> >> Hi,
> >>
> >> I am trying to run the first Pig sample from the book "Hadoop: The
> >> Definitive Guide".
> >>
> >> Unfortunately that doesn't work for me.
> >>
> >> I downloaded 0.12.0 and got the impression it should work with Hadoop 2.2.
> >>
> >> http://pig.apache.org/releases.html#14+October%2C+2013%3A+release+0.12.0+available
> >>> 14 October, 2013: release 0.12.0 available
> >>> This release includes several new features such as the ASSERT operator,
> >>> Streaming UDF, new AvroStorage, IN/CASE operator, BigInteger/BigDecimal
> >>> data type, support for Windows.
> >>> Note
> >>> This release works with Hadoop 0.20.X, 1.X, 0.23.X and 2.X
> >>
> >> I use Hadoop 2.x.
> >>
> >>> snow:bin mkamp$ which hadoop
> >>> /Users/mkamp/hadoop-2.2.0/bin//hadoop
> >>
> >>> snow:bin mkamp$ echo $HADOOP_HOME
> >>> /Users/mkamp/hadoop-2.2.0
> >>
> >> But no matter whether HADOOP_HOME is set or not, I get a couple of errors
> >> and it doesn't work when I run the script:
> >>
> >>> records = LOAD 'micro-tab/sample.txt'
> >>>     AS (year:chararray, temperature:int, quality:int);
> >>> DUMP records;
> >>
> >> All hell breaks loose and there is a lot of output, but most of it seems
> >> meaningless: warnings about settings that are deprecated in Hadoop, but
> >> still delivered by default this way.
> >>
> >> Hard to say what is relevant. Here are some excerpts; full output
> >> attached as a file.
> >>
> >> From the logfile:
> >>
> >>> Unexpected System Error Occured: java.lang.IncompatibleClassChangeError:
> >>> Found interface org.apache.hadoop.mapreduce.JobContext, but class was
> >>> expected
> >>>         at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.
> >>> PigOutputFormat.setupUdfEnvAndStores(PigOutputFormat.java:225)
> >>
> >>> ERROR 1066: Unable to open iterator for alias records
> >>
> >> From the console:
> >>
> >>> 2014-01-09 22:24:45,976 [main] WARN org.apache.pig.backend.hadoop20.PigJobControl
> >>> - falling back to default JobControl (not using hadoop 0.20 ?)
> >>> java.lang.NoSuchFieldException: runnerState
> >>>         at java.lang.Class.getDeclaredField(Class.java:1918)
> >>
> >> But as a little googling indicated, this is business as usual?
> >>
> >>> 2014-01-09 22:24:49,228 [JobControl] ERROR org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl
> >>> - Error while trying to run jobs.
> >>> java.lang.IncompatibleClassChangeError: Found interface
> >>> org.apache.hadoop.mapreduce.JobContext, but class was expected
> >>
> >>> Input(s):
> >>> Failed to read data from "hdfs://localhost/user/mkamp/micro-tab/sample.txt"
> >>
> >> That last one looks interesting. Maybe I am using it wrong and the
> >> reported errors are not related? I wanted to read from the local file
> >> system.
> >>
> >> So I also changed the script to read from HDFS, but that didn't change
> >> the error.
> >>
> >> Any ideas where to go from here?
> >>
> >> Is it possible to run the latest Hadoop binary download and the latest
> >> Pig binary download together?
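For reference, the fix Josh describes can be sketched as a small shell snippet. This is a hedged sketch, not an authoritative recipe: it assumes PIG_HOME points at an unpacked Pig 0.12.0 tree and that ant is on the PATH, and it guards both so the snippet is a no-op when they are missing. The `-Dhadoopversion=23` property selects Pig's build profile for the Hadoop 0.23.x/2.x API line, which is what resolves the `IncompatibleClassChangeError` (in Hadoop 2, `JobContext` became an interface, so jars compiled against the Hadoop 1 class hierarchy fail at runtime).

```shell
# Hedged sketch: rebuild Pig 0.12.0 against the Hadoop 2 API line.
# Assumes PIG_HOME points at a Pig 0.12.0 tree and ant is installed.
if [ -n "$PIG_HOME" ] && command -v ant >/dev/null 2>&1; then
  cd "$PIG_HOME"
  # "23" is the build profile covering Hadoop 0.23.x and 2.x
  ant clean jar-withouthadoop -Dhadoopversion=23
else
  echo "skipping rebuild: both PIG_HOME and ant are required" >&2
fi
```

After the rebuild finishes, re-running the original Pig script against Hadoop 2.2.0 should no longer hit the `JobContext` error.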