Execute the jps command in a shell to check whether the NameNode and JobTracker are running correctly.
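For reference, on a healthy pseudo-distributed node all five daemons should show up in the listing. The output looks roughly like this (the PIDs are illustrative and will differ):

    $ jps
    4201 NameNode
    4257 DataNode
    4313 SecondaryNameNode
    4370 JobTracker
    4426 TaskTracker
    4480 Jps

If NameNode or JobTracker is missing from the list, check the matching log under $HADOOP_HOME/logs before debugging the Pig side.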
On Fri, Aug 27, 2010 at 9:49 AM, rahul <rmalv...@apple.com> wrote:
> Hi Jeff,
>
> I transferred the hadoop conf files to the pig/conf location, but I still
> get the same error.
>
> Is the issue with the configuration files or with the HDFS file system?
>
> Can I test the connection to hdfs (localhost/127.0.0.1:9001) in some way?
>
> Steps I did:
>
> 1. I initially formatted my local file system using the
>    ./hadoop namenode -format command. I believe this mounts the local
>    file system to HDFS.
> 2. Then I configured the hadoop conf files and started the ./start-all
>    script.
> 3. Started Pig with a custom pig script which should read hdfs, as I
>    passed HADOOP_CONF_DIR as a parameter. The command was:
>
>    java -cp $PIGDIR/pig.jar:$HADOOP_CONF_DIR org.apache.pig.Main script1-hadoop.pig
>
> Please let me know if these steps miss something.
>
> Thanks,
> Rahul
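A quick way to test that connection (assuming the default ports in the configs quoted below: 9000 for the NameNode, 9001 for the JobTracker) is to check that the ports are open and that the hadoop client itself can reach HDFS. A minimal sketch, with $HADOOP_HOME standing in for your install directory:

    # raw TCP check: is anything listening on the RPC ports?
    nc -z localhost 9000 && echo "namenode port open"
    nc -z localhost 9001 && echo "jobtracker port open"

    # protocol-level check: can the hadoop client actually list HDFS?
    $HADOOP_HOME/bin/hadoop fs -ls /

If nc connects but hadoop fs fails, the daemons are up but the client and server are not speaking the same RPC version.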
> On Aug 26, 2010, at 6:33 PM, Jeff Zhang wrote:
>
>> Try putting the hadoop xml configuration files into the pig/conf folder.
>>
>> On Thu, Aug 26, 2010 at 6:22 PM, rahul <rmalv...@apple.com> wrote:
>>> Hi Jeff,
>>>
>>> I have set the hadoop conf on the classpath by setting the
>>> $HADOOP_CONF_DIR variable.
>>>
>>> But I have both Pig and hadoop running on the same machine, so
>>> localhost should not make a difference.
>>>
>>> So I have used all the default config settings for core-site.xml,
>>> hdfs-site.xml, and mapred-site.xml, as per the hadoop tutorial.
>>>
>>> Please let me know if my understanding is correct.
>>>
>>> I am attaching the conf files as well.
>>>
>>> hdfs-site.xml:
>>>
>>> <?xml version="1.0"?>
>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>
>>> <!-- Put site-specific property overrides in this file. -->
>>>
>>> <configuration>
>>>   <property>
>>>     <name>fs.default.name</name>
>>>     <value>hdfs://localhost:9000</value>
>>>     <description>The name of the default file system. A URI whose
>>>     scheme and authority determine the FileSystem implementation. The
>>>     uri's scheme determines the config property (fs.SCHEME.impl) naming
>>>     the FileSystem implementation class. The uri's authority is used to
>>>     determine the host, port, etc. for a filesystem.</description>
>>>   </property>
>>>
>>>   <property>
>>>     <name>dfs.replication</name>
>>>     <value>1</value>
>>>     <description>Default block replication. The actual number of
>>>     replications can be specified when the file is created. The default
>>>     is used if replication is not specified at create time.</description>
>>>   </property>
>>> </configuration>
>>>
>>> core-site.xml:
>>>
>>> <?xml version="1.0"?>
>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>
>>> <!-- Put site-specific property overrides in this file. -->
>>>
>>> <configuration>
>>>   <property>
>>>     <name>hadoop.tmp.dir</name>
>>>     <value>/Users/rahulmalviya/Documents/Hadoop/hadoop-0.21.0/hadoop-${user.name}</value>
>>>     <description>A base for other temporary directories.</description>
>>>   </property>
>>> </configuration>
>>>
>>> mapred-site.xml:
>>>
>>> <?xml version="1.0"?>
>>> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>>>
>>> <!-- Put site-specific property overrides in this file. -->
>>>
>>> <configuration>
>>>   <property>
>>>     <name>mapred.job.tracker</name>
>>>     <value>localhost:9001</value>
>>>     <description>The host and port that the MapReduce job tracker runs
>>>     at. If "local", then jobs are run in-process as a single map and
>>>     reduce task.</description>
>>>   </property>
>>>
>>>   <property>
>>>     <name>mapred.tasktracker.tasks.maximum</name>
>>>     <value>8</value>
>>>     <description>The maximum number of tasks that will be run
>>>     simultaneously by a task tracker.</description>
>>>   </property>
>>> </configuration>
>>>
>>> Please let me know if there is an issue in my configuration. Any input
>>> is valuable to me.
>>>
>>> Thanks,
>>> Rahul
>>>
>>> On Aug 26, 2010, at 6:10 PM, Jeff Zhang wrote:
>>>
>>>> Did you put the hadoop conf on the classpath? It seems you are still
>>>> using the local file system but connecting to Hadoop's JobTracker.
>>>> Make sure you set the correct configuration in core-site.xml,
>>>> hdfs-site.xml, and mapred-site.xml, and put them on the classpath.
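On the classpath point above: the entry has to be the conf directory itself, because Hadoop resolves core-site.xml, hdfs-site.xml, and mapred-site.xml by name as classpath resources; listing the individual files does not work. A minimal sketch, assuming /path/to/hadoop/conf is the directory holding the three files:

    # HADOOP_CONF_DIR must point at the directory, not at the xml files
    export HADOOP_CONF_DIR=/path/to/hadoop/conf
    java -cp $PIGDIR/pig.jar:$HADOOP_CONF_DIR org.apache.pig.Main script1-hadoop.pig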
>>>> On Thu, Aug 26, 2010 at 5:32 PM, rahul <rmalv...@apple.com> wrote:
>>>>> Hi,
>>>>>
>>>>> I am trying to integrate Pig with Hadoop for processing of jobs.
>>>>>
>>>>> I am able to run Pig in local mode and Hadoop with the streaming API
>>>>> perfectly.
>>>>>
>>>>> But when I try to run Pig with Hadoop I get the following error:
>>>>>
>>>>> Pig Stack Trace
>>>>> ---------------
>>>>> ERROR 2116: Unexpected error. Could not validate the output
>>>>> specification for:
>>>>> file:///Users/rahulmalviya/Documents/Pig/dev/main_merged_hdp_out
>>>>>
>>>>> org.apache.pig.impl.plan.PlanValidationException: ERROR 0: An
>>>>> unexpected exception caused the validation to stop
>>>>>   at org.apache.pig.impl.plan.PlanValidator.validate(PlanValidator.java:56)
>>>>>   at org.apache.pig.impl.logicalLayer.validators.InputOutputFileValidator.validate(InputOutputFileValidator.java:49)
>>>>>   at org.apache.pig.impl.logicalLayer.validators.InputOutputFileValidator.validate(InputOutputFileValidator.java:37)
>>>>>   at org.apache.pig.impl.logicalLayer.validators.LogicalPlanValidationExecutor.validate(LogicalPlanValidationExecutor.java:89)
>>>>>   at org.apache.pig.PigServer.validate(PigServer.java:930)
>>>>>   at org.apache.pig.PigServer.compileLp(PigServer.java:910)
>>>>>   at org.apache.pig.PigServer.compileLp(PigServer.java:871)
>>>>>   at org.apache.pig.PigServer.compileLp(PigServer.java:852)
>>>>>   at org.apache.pig.PigServer.execute(PigServer.java:816)
>>>>>   at org.apache.pig.PigServer.access$100(PigServer.java:105)
>>>>>   at org.apache.pig.PigServer$Graph.execute(PigServer.java:1080)
>>>>>   at org.apache.pig.PigServer.executeBatch(PigServer.java:288)
>>>>>   at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:109)
>>>>>   at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
>>>>>   at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:138)
>>>>>   at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:89)
>>>>>   at org.apache.pig.Main.main(Main.java:391)
>>>>> Caused by: org.apache.pig.impl.plan.PlanValidationException: ERROR 2116:
>>>>> Unexpected error. Could not validate the output specification for:
>>>>> file:///Users/rahulmalviya/Documents/Pig/dev/main_merged_hdp_out
>>>>>   at org.apache.pig.impl.logicalLayer.validators.InputOutputFileVisitor.visit(InputOutputFileVisitor.java:93)
>>>>>   at org.apache.pig.impl.logicalLayer.LOStore.visit(LOStore.java:140)
>>>>>   at org.apache.pig.impl.logicalLayer.LOStore.visit(LOStore.java:37)
>>>>>   at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:67)
>>>>>   at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:69)
>>>>>   at org.apache.pig.impl.plan.DepthFirstWalker.depthFirst(DepthFirstWalker.java:69)
>>>>>   at org.apache.pig.impl.plan.DepthFirstWalker.walk(DepthFirstWalker.java:50)
>>>>>   at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
>>>>>   at org.apache.pig.impl.plan.PlanValidator.validate(PlanValidator.java:50)
>>>>>   ... 16 more
>>>>> Caused by: java.io.IOException: Call to localhost/127.0.0.1:9001 failed
>>>>> on local exception: java.io.EOFException
>>>>>   at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
>>>>>   at org.apache.hadoop.ipc.Client.call(Client.java:743)
>>>>>   at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>>>>>   at org.apache.hadoop.mapred.$Proxy0.getProtocolVersion(Unknown Source)
>>>>>   at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
>>>>>   at org.apache.hadoop.mapred.JobClient.createRPCProxy(JobClient.java:429)
>>>>>   at org.apache.hadoop.mapred.JobClient.init(JobClient.java:423)
>>>>>   at org.apache.hadoop.mapred.JobClient.<init>(JobClient.java:410)
>>>>>   at org.apache.hadoop.mapreduce.Job.<init>(Job.java:50)
>>>>>   at org.apache.pig.impl.logicalLayer.validators.InputOutputFileVisitor.visit(InputOutputFileVisitor.java:89)
>>>>>   ... 24 more
>>>>> Caused by: java.io.EOFException
>>>>>   at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>>>>   at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
>>>>>   at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
>>>>> ================================================================================
>>>>>
>>>>> Has anyone gotten the same error? I think it is related to the
>>>>> connection between Pig and Hadoop.
>>>>>
>>>>> Can someone tell me how to connect Pig and Hadoop?
>>>>>
>>>>> Thanks.
>>>>
>>>> --
>>>> Best Regards
>>>>
>>>> Jeff Zhang
>>
>> --
>> Best Regards
>>
>> Jeff Zhang

--
Best Regards

Jeff Zhang
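A note on the trace for anyone hitting the same error: an EOFException on the very first RPC call (getProtocolVersion against localhost/127.0.0.1:9001) is the classic symptom of a client/server version mismatch. Here that would plausibly be the Hadoop client classes bundled inside pig.jar not matching the hadoop-0.21.0 daemons the configs point at. One way to compare the two sides (paths are illustrative):

    # version the running daemons were built from
    $HADOOP_HOME/bin/hadoop version

    # which hadoop client classes pig.jar actually bundles
    unzip -l $PIGDIR/pig.jar | grep 'org/apache/hadoop/ipc' | head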