Apache Pig does not work on Amazon's Elastic MapReduce
------------------------------------------------------
Key: PIG-2562
URL: https://issues.apache.org/jira/browse/PIG-2562
Project: Pig
Issue Type: Bug
Affects Versions: 0.9.1, 0.9.2, 0.10, 0.11
Environment: Amazon Elastic MapReduce
Reporter: Russell Jurney
Priority: Critical

See https://forums.aws.amazon.com/thread.jspa?messageID=323063

According to this thread, only Amazon's proprietary hadoop-core.jar enables S3 to work with Pig. Apache Pig does not work.

Example:

Apache Pig branch-0.9 as of today:

hadoop@ip-10-195-159-114:~$ pig/bin/pig
grunt> cd s3://elasticmapreduce/samples/pig-apache/input/
2012-02-29 05:45:22,282 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. This file system object (hdfs://10.195.159.114:9000) does not support access to the request path 's3://elasticmapreduce/samples/pig-apache/input' You possibly called FileSystem.get(conf) when you should have called FileSystem.get(uri, conf) to obtain a file system supporting your path.
Details at logfile: /home/hadoop/pig_1330494091268.log
grunt> quit

EMR's Pig as of today:

hadoop@ip-10-195-159-114:~$ pig
2012-02-29 05:45:35,626 [main] INFO org.apache.pig.Main - Logging error messages to: /home/hadoop/pig_1330494335621.log
2012-02-29 05:45:35,841 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://10.195.159.114:9000
2012-02-29 05:45:36,200 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: 10.195.159.114:9001
grunt> cd s3://elasticmapreduce/samples/pig-apache/input/
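
The ERROR 2999 text distinguishes FileSystem.get(conf), which in Hadoop returns the default configured file system (here hdfs://10.195.159.114:9000), from FileSystem.get(uri, conf), which selects a file system from the path's own URI. The following is a minimal sketch of that scheme check in plain Java, with no Hadoop dependency. It is a simplified model rather than Hadoop's or Pig's actual code, and the class and method names (FsSchemeCheck, supports, resolve) are hypothetical.

```java
import java.net.URI;

// Simplified model of the scheme check behind ERROR 2999.
// NOT Hadoop's implementation; all names here are hypothetical.
final class FsSchemeCheck {

    // True when a file system rooted at fsUri could serve the given path.
    // A path without a scheme is treated as relative to that file system.
    static boolean supports(URI fsUri, URI path) {
        String scheme = path.getScheme();
        if (scheme == null) {
            return true;
        }
        if (!scheme.equalsIgnoreCase(fsUri.getScheme())) {
            return false; // e.g. an s3:// path handed to an hdfs:// file system
        }
        String authority = path.getAuthority();
        return authority == null || authority.equalsIgnoreCase(fsUri.getAuthority());
    }

    // Picks the file system URI from the path itself (the idea behind
    // FileSystem.get(uri, conf)), falling back to the default when the
    // path carries no scheme.
    static URI resolve(URI defaultFs, URI path) {
        if (path.getScheme() == null) {
            return defaultFs;
        }
        String authority = path.getAuthority();
        return authority == null
                ? URI.create(path.getScheme() + ":///")
                : URI.create(path.getScheme() + "://" + authority);
    }

    public static void main(String[] args) {
        URI defaultFs = URI.create("hdfs://10.195.159.114:9000");
        URI s3Path = URI.create("s3://elasticmapreduce/samples/pig-apache/input/");
        // The default file system rejects the s3:// path ...
        System.out.println("default supports path: " + supports(defaultFs, s3Path));
        // ... while resolving from the path's URI selects the s3 file system.
        System.out.println("resolved file system:  " + resolve(defaultFs, s3Path));
    }
}
```

In this model the hdfs:// default rejects the s3:// path from the transcript, which matches the symptom reported. Whether Pig's `cd` follows exactly this code path is not established by the issue text.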