Apache Pig does not work on Amazon's Elastic MapReduce
------------------------------------------------------
Key: PIG-2562
URL: https://issues.apache.org/jira/browse/PIG-2562
Project: Pig
Issue Type: Bug
Affects Versions: 0.9.1, 0.9.2, 0.10, 0.11
Environment: Amazon Elastic MapReduce
Reporter: Russell Jurney
Priority: Critical
See https://forums.aws.amazon.com/thread.jspa?messageID=323063
According to this thread, only Amazon's proprietary hadoop-core.jar enables S3
to work on with Pig. Apache Pig does not work.
Example:
Apache Pig branch-0.9 as of today:
hadoop@ip-10-195-159-114:~$ pig/bin/pig
grunt> cd s3://elasticmapreduce/samples/pig-apache/input/
2012-02-29 05:45:22,282 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR
2999: Unexpected internal error. This file system object
(hdfs://10.195.159.114:9000) does not support access to the request path
's3://elasticmapreduce/samples/pig-apache/input' You possibly called
FileSystem.get(conf) when you should have called FileSystem.get(uri, conf) to
obtain a file system supporting your path.
Details at logfile: /home/hadoop/pig_1330494091268.log
grunt> quit
EMR's Pig as of today:
hadoop@ip-10-195-159-114:~$ pig
2012-02-29 05:45:35,626 [main] INFO org.apache.pig.Main - Logging error
messages to: /home/hadoop/pig_1330494335621.log
2012-02-29 05:45:35,841 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to
hadoop file system at: hdfs://10.195.159.114:9000
2012-02-29 05:45:36,200 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to
map-reduce job tracker at: 10.195.159.114:9001
grunt> cd s3://elasticmapreduce/samples/pig-apache/input/
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira