Can't run penny in MapReduce mode --------------------------------- Key: PIG-2459 URL: https://issues.apache.org/jira/browse/PIG-2459 Project: Pig Issue Type: Bug Affects Versions: 0.9.1 Environment: Testing in a cluster with 8 node Ubuntu 11.10 64bit java 1.6 Hadoop 0.20.203 Reporter: Andy He Priority: Minor
First of all, the hadoop cluster is fine, and the pig program performs well. Yet, when I'm trying to run the penny tool in the MapReduce Mode with the command: _java -cp pig-0.9.1/contrib/penny/java/penny.jar:pig-0.9.1/pig-0.9.1.jar:$HADOOP_CONF_DIR org.apache.pig.penny.apps.ds.Main test.pig_ I get the following errors: =========== INFO executionengine.HExecutionEngine: Connecting to hadoop file system at: hdfs://lotr4.comp.polyu.edu.hk:9000 Exception in thread "main" java.lang.RuntimeException: Failed to create DataStorage at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75) at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58) at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:203) at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:118) at org.apache.pig.impl.PigContext.connect(PigContext.java:185) at org.apache.pig.PigServer.<init>(PigServer.java:244) at org.apache.pig.PigServer.<init>(PigServer.java:229) at org.apache.pig.tools.ToolsPigServer.<init>(ToolsPigServer.java:70) at org.apache.pig.penny.ParsedPigScript.<init>(ParsedPigScript.java:82) at org.apache.pig.penny.PennyServer.parse(PennyServer.java:44) at org.apache.pig.penny.apps.ds.Main.main(Main.java:36) Caused by: java.io.IOException: Call to lotr4.comp.polyu.edu.hk/158.132.10.162:9000 failed on local exception: java.io.EOFException at org.apache.hadoop.ipc.Client.wrapException(Client.java:775) at org.apache.hadoop.ipc.Client.call(Client.java:743) at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220) at $Proxy0.getProtocolVersion(Unknown Source) at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359) at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106) at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207) at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170) at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66) at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196) at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95) at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72) ... 10 more Caused by: java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501) at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446) =========== The test.pig script just loads a file and then stores it back: ----------- data = LOAD 'input/student'; STORE data INTO 'output'; ----------- Is it the problem that I miss some environmental variables for penny? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira