Hm, nothing jumps out. The error you are getting definitely indicates that
the preprocessor is somehow trying to substitute a parameter called PIGDIR in
your script, which is odd. Does the same thing happen if you run bin/pig -f
test_lite.pig? If yes, try running with -secretDebugCmd (shh, it's secret)
and send along the output. Why are you using pig-core instead of the pig.jar
that ant should be generating for you when you build? Where did you get it,
how did you build it, and what's the output of cksum on it?

The fact that your 0.5 is now broken too makes me think that maybe your
symlinks are messed up. Surely a change to one jar shouldn't affect a
totally unrelated jar in a directory the first jar doesn't know about.
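
To answer the symlink and jar questions in one go, something along these
lines might help. It is only a rough sketch: describe_path is a helper name
I made up, the jar paths are the ones from your message, and readlink -f
assumes GNU coreutils (fine on your Linux desktop, may differ elsewhere).

```shell
#!/bin/sh
# Print where a path really points and, for regular files, its checksum.
# describe_path is a hypothetical helper, not part of Pig.
describe_path() {
    path="$1"
    if [ ! -e "$path" ]; then
        echo "MISSING $path"
        return 1
    fi
    # readlink -f follows the whole symlink chain (GNU coreutils assumed).
    real=$(readlink -f "$path")
    echo "$path -> $real"
    # cksum is only meaningful for regular files, not directories.
    if [ -f "$real" ]; then
        cksum "$real"
    fi
}

# PIGDIR comes from the environment; the jar names are the ones from this thread.
if [ -n "${PIGDIR:-}" ]; then
    describe_path "$PIGDIR" || true
    describe_path "$PIGDIR/build/pig-0.6.0-dev.jar" || true
    describe_path "$PIGDIR/pig-0.6.0-dev-core.jar" || true
fi
```

If the two jars print different checksums, or PIGDIR resolves somewhere
other than your 0.6 checkout, that would at least narrow things down.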

Sorry about the barrage of questions, I am just a bit dumbfounded about how
this could even begin to start happening. Any preprocessor experts around?
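
One more cheap check in the meantime: list every $NAME-style token in the
exact file you pass to Pig, since those are the tokens that would look like
parameters to the preprocessor. Again just a sketch: find_param_refs is a
made-up helper name, and the pattern is my approximation of a parameter
name, not the preprocessor's actual grammar. Positional references such as
$0 are deliberately not matched.

```shell
#!/bin/sh
# Print "line:token" for every $NAME-style token in a Pig script.
# find_param_refs is a hypothetical helper, not part of Pig.
find_param_refs() {
    # \$ is a literal dollar sign; the name must start with a letter or underscore.
    grep -n -o '\$[A-Za-z_][A-Za-z0-9_]*' "$1"
}

# Defaults to the script name used in this thread; pass another path as $1.
script="${1:-test_lite.pig}"
if [ -f "$script" ]; then
    find_param_refs "$script" || echo "no \$NAME tokens in $script"
fi
```

If this prints nothing for test_lite.pig, then the file Pig is reading is
probably not the file you think it is.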

-D


On Tue, Feb 16, 2010 at 3:37 AM, Alex Parvulescu
<[email protected]> wrote:

> Hello
>
> And thanks again for all your help
>
> I have a symbolic link /home/alex/hadoop/pig which points to
> /home/alex/hadoop/pig-branch-0.6-ro - this is a checkout and build of the
> 0.6 branch.
> I'm running the script from /home/alex/hadoop/test.
>
> I didn't touch pig.properties, also I'm running with the default pig
> script.
>
> The only thing I did was copy pig/build/pig-0.6.0-dev.jar
> to pig/pig-0.6.0-dev-core.jar, because otherwise it would not work
> (Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/pig/Main).
>
> I'm running with a fresh 0.6 build and still no luck.
>
> The info you requested:
>
> a...@alex-desktop:~/hadoop/test$ pwd
> /home/alex/hadoop/test
> a...@alex-desktop:~/hadoop/test$ printenv | grep JAVA
> JAVA_HOME=/usr/lib/jvm/java-6-sun
> a...@alex-desktop:~/hadoop/test$ printenv | grep PIG
> PIGDIR=/home/alex/hadoop/pig
> a...@alex-desktop:~/hadoop/test$ printenv | grep HADOOP
> HADOOP_HOME=/home/alex/hadoop/hadoop
> HADOOP_CONF_DIR=/home/alex/hadoop/hadoop/conf
> HADOOPDIR=/home/alex/hadoop/hadoop/conf
>
> a...@alex-desktop:~/hadoop/test$ cat /home/alex/hadoop/pig/conf/pig.properties
> # Pig configuration file. All values can be overwritten by command line arguments.
> # see bin/pig -help
>
> # log4jconf log4j configuration file
> # log4jconf=./conf/log4j.properties
>
> # brief logging (no timestamps)
> brief=false
>
> # clustername, name of the hadoop jobtracker. If no port is defined port 50020 will be used.
> #cluster
>
> #debug level, INFO is default
> debug=INFO
>
> # a file that contains pig script
> #file=
>
> # load jarfile, colon separated
> #jar=
>
> #verbose print all log messages to screen (default to print only INFO and above to screen)
> verbose=false
>
> #exectype local|mapreduce, mapreduce is default
> #exectype=mapreduce
> # hod realted properties
> #ssh.gateway
> #hod.expect.root
> #hod.expect.uselatest
> #hod.command
> #hod.config.dir
> #hod.param
>
>
> #Do not spill temp files smaller than this size (bytes)
> pig.spill.size.threshold=5000000
> #EXPERIMENT: Activate garbage collection when spilling a file bigger than this size (bytes)
> #This should help reduce the number of files being spilled.
> pig.spill.gc.activation.size=40000000
>
>
> ######################
> # Everything below this line is Yahoo specific.  Note that I've made
> # (almost) no changes to the lines above to make merging in from Apache
> # easier.  Any values I don't want from above I override below.
> #
> # This file is configured for use with HOD on the production clusters.  If you
> # want to run pig with a static cluster you will need to remove everything
> # below this line and set the cluster value (above) to the
> # hostname and port of your job tracker.
>
> exectype=mapreduce
>
> hod.config.dir=/export/crawlspace/kryptonite/hod/current/conf
> hod.server=local
>
> cluster.domain=inktomisearch.com
>
> log.file=
>
> yinst.cluster=kryptonite
>
>
>
> And now boom!
>
> java -cp $PIGDIR/pig-0.6.0-dev-core.jar:$HADOOPDIR org.apache.pig.Main test_lite.pig
>
>
> 2010-02-16 12:27:17,283 [main] INFO  org.apache.pig.Main - Logging error messages to: /home/alex/hadoop/test/pig_1266319637282.log
> 2010-02-16 12:27:17,303 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Undefined parameter : PIGDIR
> Details at logfile: /home/alex/hadoop/test/pig_1266319637282.log
>
> The log file:
> a...@alex-desktop:~/hadoop/test$ cat pig_1266319637282.log
> Error before Pig is launched
> ----------------------------
> ERROR 2999: Unexpected internal error. Undefined parameter : PIGDIR
>
> java.lang.RuntimeException: Undefined parameter : PIGDIR
>    at org.apache.pig.tools.parameters.PreprocessorContext.substitute(PreprocessorContext.java:232)
>    at org.apache.pig.tools.parameters.ParameterSubstitutionPreprocessor.parsePigFile(ParameterSubstitutionPreprocessor.java:106)
>    at org.apache.pig.tools.parameters.ParameterSubstitutionPreprocessor.genSubstitutedFile(ParameterSubstitutionPreprocessor.java:86)
>    at org.apache.pig.Main.runParamPreprocessor(Main.java:515)
>    at org.apache.pig.Main.main(Main.java:366)
>
> ================================================================================
>
>
> The script: 'A = load '/home/alex/hadoop/test/t.csv' using PigStorage('\t') as (id: long); '
> The file: '1'.
> Simple enough :)
>
> I'm all out of ideas. It seems that even 0.5 is broken now. I can't run
> anything as a script. If I enter the statements manually in grunt (line by
> line), it's all good on any version.
>
> thanks,
> alex
>
>
> On Tue, Feb 16, 2010 at 9:58 AM, Dmitriy Ryaboy <[email protected]>
> wrote:
>
> > Pig does start; the error you are getting comes from inside pig.Main.
> > I think something is getting messed up in your environment because you
> > are juggling too many different versions of pig (granted, I have 3 or
> > 4 in various stages of development on my laptop most of the time and
> > haven't had your problems, but then none of them has a hacked
> > bin/pig, with the exception of the cloudera one...). I've never tried
> > running multiple ant tests at once, either. There's a short "sanity"
> > version of the tests, ant test-commit, that runs in under 10 minutes. You
> > might want to try that if you are not doing things like changing join
> > implementations or messing with the optimizer.
> >
> > Let's do this: send a full trace of what you are doing and what your
> > environment looks like.  Something like
> >
> > pwd
> > printenv | grep JAVA
> > printenv | grep PIG
> > printenv | grep HADOOP
> > cat conf/pig.properties
> > java -cp ........
> > <boom!>
> >
> > -D
> >
> >
> > On Tue, Feb 16, 2010 at 12:39 AM, Alex Parvulescu
> > <[email protected]> wrote:
> > > Hello,
> > >
> > > sorry for the delay, but I wanted to build from the source again just to
> > > make sure.
> > >
> > > The script is like this:
> > >
> > > 'A = load '/home/alex/hadoop/reviews/r.csv' using PigStorage('\t') as (id:
> > > long, hid: long, locale: chararray, r1: int, r2: int, r3: int, r4: int); '
> > >
> > > That's it. My guess is that Pig doesn't even start. Do you think I need to
> > > change something in the properties file? I'm not sure anymore and I don't
> > > have a lot of luck going through the sources.
> > >
> > > And another thing, I tried running 'ant test' for both 0.6-branch and trunk
> > > at the same time (because they take a very long time) and both test scripts
> > > failed. I've switched to running them one after the other and they are fine.
> > > Do you think that is ok?
> > >
> > > thanks for your time,
> > > alex
> > >
> > > On Fri, Feb 12, 2010 at 6:37 PM, Dmitriy Ryaboy <[email protected]> wrote:
> > >
> > >> what does your script1-hadoop.pig look like?
> > >>
> > >> The error you are getting happens when the pig preprocessor can't
> > >> substitute Pig variables (the stuff you specify with -param and
> > >> %default, etc).  Do you have $PIGDIR in your script somewhere?
> > >>
> > >> -D
> > >>
> > >> On Fri, Feb 12, 2010 at 6:51 AM, Alex Parvulescu
> > >> <[email protected]> wrote:
> > >> > Hello,
> > >> >
> > >> > I seem to have broken my Pig install, and I don't know where to look.
> > >> >
> > >> > If I use directly the script (grunt) everything works ok, but every time I
> > >> > try to run a pig script:  'java -cp $PIGDIR/pig.jar:$HADOOPSITEPATH
> > >> > org.apache.pig.Main script1-hadoop.pig'
> > >> >
> > >> > I get this nice error: [main] ERROR org.apache.pig.Main - ERROR 2999:
> > >> > Unexpected internal error. Undefined parameter : PIGDIR
> > >> >
> > >> > Obviously I have the PIGDIR var set:
> > >> >> echo $PIGDIR
> > >> >> /home/alex/hadoop/pig
> > >> >
> > >> > This is something that I did, as I have used 0.5 and 0.6 and a patched
> > >> > version of 0.6 in parallel, but I can't figure out where to look. Any
> > >> > version I try to start now, gives the same error.
> > >> >
> > >> > any help would be greatly appreciated!
> > >> >
> > >> > alex
> > >> >
> > >>
> > >
> >
>
