Hi, I deployed Hadoop 2.4 on AWS EC2, using the S3 native file system (s3n) as a replacement for HDFS. I tried several example apps, and all of them failed with the stack trace below (an older thread from Jul 24 stalled without being resolved, so I'm attaching the full DEBUG output here):
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar wordcount s3n://mybkt/wc/ s3n://mybkt/out

14/08/12 21:57:35 DEBUG util.Shell: setsid exited with exit code 0
14/08/12 21:57:36 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of successful kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
14/08/12 21:57:36 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[Rate of failed kerberos logins and latency (milliseconds)], about=, type=DEFAULT, always=false, sampleName=Ops)
14/08/12 21:57:36 DEBUG lib.MutableMetricsFactory: field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, value=[GetGroups], about=, type=DEFAULT, always=false, sampleName=Ops)
14/08/12 21:57:36 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group related metrics
14/08/12 21:57:36 DEBUG util.KerberosName: Kerberos krb5 configuration not found, setting default realm to empty
14/08/12 21:57:36 DEBUG security.Groups: Creating new Groups object
14/08/12 21:57:36 DEBUG util.NativeCodeLoader: Trying to load the custom-built native-hadoop library...
14/08/12 21:57:36 DEBUG util.NativeCodeLoader: Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
14/08/12 21:57:36 DEBUG util.NativeCodeLoader: java.library.path=/home/ubuntu/hadoop-2.4.0/lib
14/08/12 21:57:36 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/08/12 21:57:36 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Falling back to shell based
14/08/12 21:57:36 DEBUG security.JniBasedUnixGroupsMappingWithFallback: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping
14/08/12 21:57:36 DEBUG security.Groups: Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback; cacheTimeout=300000; warningDeltaMs=5000
14/08/12 21:57:36 DEBUG security.UserGroupInformation: hadoop login
14/08/12 21:57:36 DEBUG security.UserGroupInformation: hadoop login commit
14/08/12 21:57:36 DEBUG security.UserGroupInformation: using local user:UnixPrincipal: ubuntu
14/08/12 21:57:36 DEBUG security.UserGroupInformation: UGI loginUser:ubuntu (auth:SIMPLE)
14/08/12 21:57:36 DEBUG service.Jets3tProperties: s3service.https-only=true
14/08/12 21:57:36 DEBUG service.Jets3tProperties: storage-service.internal-error-retry-max=5
14/08/12 21:57:36 DEBUG service.Jets3tProperties: http.connection-manager.factory-class-name=org.jets3t.service.utils.RestUtils$ConnManagerFactory
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.connection-timeout-ms=60000
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.socket-timeout-ms=60000
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.stale-checking-enabled=true
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.useragent=null
14/08/12 21:57:36 DEBUG utils.RestUtils: Setting user agent string: JetS3t/0.9.0 (Linux/3.13.0-29-generic; amd64; en; JVM 1.7.0_55)
14/08/12 21:57:36 DEBUG service.Jets3tProperties: http.protocol.expect-continue=true
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.connection-manager-timeout=0
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.retry-max=5
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.proxy-autodetect=true
14/08/12 21:57:36 DEBUG service.Jets3tProperties: s3service.s3-endpoint=s3.amazonaws.com
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: About to attempt auto proxy detection under Java version:1.7.0_55-b14
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: Sun Plugin reported java version not 1.3.X, 1.4.X, 1.5.X or 1.6.X - trying failover detection...
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: Using failover proxy detection...
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: Plugin Proxy Config List Property:null
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: No configured plugin proxy list
14/08/12 21:57:36 DEBUG service.Jets3tProperties: s3service.default-storage-class=null
14/08/12 21:57:36 DEBUG service.Jets3tProperties: s3service.server-side-encryption=null
14/08/12 21:57:36 DEBUG service.Jets3tProperties: http.connection-manager.factory-class-name=org.jets3t.service.utils.RestUtils$ConnManagerFactory
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.connection-timeout-ms=60000
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.socket-timeout-ms=60000
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.stale-checking-enabled=true
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.useragent=null
14/08/12 21:57:36 DEBUG utils.RestUtils: Setting user agent string: JetS3t/0.9.0 (Linux/3.13.0-29-generic; amd64; en; JVM 1.7.0_55)
14/08/12 21:57:36 DEBUG service.Jets3tProperties: http.protocol.expect-continue=true
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.connection-manager-timeout=0
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.retry-max=5
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.proxy-autodetect=true
14/08/12 21:57:36 DEBUG service.Jets3tProperties: s3service.s3-endpoint=s3.amazonaws.com
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: About to attempt auto proxy detection under Java version:1.7.0_55-b14
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: Sun Plugin reported java version not 1.3.X, 1.4.X, 1.5.X or 1.6.X - trying failover detection...
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: Using failover proxy detection...
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: Plugin Proxy Config List Property:null
14/08/12 21:57:36 DEBUG proxy.PluginProxyUtil: No configured plugin proxy list
14/08/12 21:57:36 DEBUG service.Jets3tProperties: devpay.user-token=null
14/08/12 21:57:36 DEBUG service.Jets3tProperties: devpay.product-token=null
14/08/12 21:57:36 DEBUG service.Jets3tProperties: httpclient.requester-pays-buckets-enabled=false
14/08/12 21:57:36 DEBUG security.UserGroupInformation: PrivilegedAction as:ubuntu (auth:SIMPLE) from:org.apache.hadoop.mapreduce.Job.connect(Job.java:1250)
14/08/12 21:57:36 DEBUG mapreduce.Cluster: Trying ClientProtocolProvider : org.apache.hadoop.mapred.YarnClientProtocolProvider
14/08/12 21:57:36 DEBUG service.AbstractService: Service: org.apache.hadoop.mapred.ResourceMgrDelegate entered state INITED
14/08/12 21:57:36 DEBUG service.AbstractService: Service: org.apache.hadoop.yarn.client.api.impl.YarnClientImpl entered state INITED
14/08/12 21:57:37 INFO client.RMProxy: Connecting to ResourceManager at /172.31.20.187:8032
14/08/12 21:57:37 DEBUG security.UserGroupInformation: PrivilegedAction as:ubuntu (auth:SIMPLE) from:org.apache.hadoop.yarn.client.RMProxy.getProxy(RMProxy.java:130)
14/08/12 21:57:37 DEBUG ipc.YarnRPC: Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
14/08/12 21:57:37 DEBUG ipc.HadoopYarnProtoRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.api.ApplicationClientProtocol
14/08/12 21:57:37 DEBUG ipc.Server: rpcKind=RPC_PROTOCOL_BUFFER, rpcRequestWrapperClass=class org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper, rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@7d66036e
14/08/12 21:57:37 DEBUG ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@71cebfd2
14/08/12 21:57:37 DEBUG service.AbstractService: Service org.apache.hadoop.yarn.client.api.impl.YarnClientImpl is started
14/08/12 21:57:37 DEBUG service.AbstractService: Service org.apache.hadoop.mapred.ResourceMgrDelegate is started
14/08/12 21:57:37 DEBUG security.UserGroupInformation: PrivilegedAction as:ubuntu (auth:SIMPLE) from:org.apache.hadoop.fs.FileContext.getAbstractFileSystem(FileContext.java:330)
14/08/12 21:57:37 DEBUG security.UserGroupInformation: PrivilegedActionException as:ubuntu (auth:SIMPLE) cause:org.apache.hadoop.fs.UnsupportedFileSystemException: No AbstractFileSystem for scheme: s3n
14/08/12 21:57:37 INFO mapreduce.Cluster: Failed to use org.apache.hadoop.mapred.YarnClientProtocolProvider due to error: Error in instantiating YarnClient
14/08/12 21:57:37 DEBUG mapreduce.Cluster: Trying ClientProtocolProvider : org.apache.hadoop.mapred.LocalClientProtocolProvider
14/08/12 21:57:37 DEBUG mapreduce.Cluster: Cannot pick org.apache.hadoop.mapred.LocalClientProtocolProvider as the ClientProtocolProvider - returned null protocol
14/08/12 21:57:37 DEBUG security.UserGroupInformation: PrivilegedActionException as:ubuntu (auth:SIMPLE) cause:java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
java.io.IOException: Cannot initialize Cluster. Please check your configuration for mapreduce.framework.name and the correspond server addresses.
        at org.apache.hadoop.mapreduce.Cluster.initialize(Cluster.java:120)
        at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:82)
        at org.apache.hadoop.mapreduce.Cluster.<init>(Cluster.java:75)
        at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1255)
        at org.apache.hadoop.mapreduce.Job$9.run(Job.java:1251)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
        at org.apache.hadoop.mapreduce.Job.connect(Job.java:1250)
        at org.apache.hadoop.mapreduce.Job.submit(Job.java:1279)
        at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1303)
        at org.apache.hadoop.examples.WordCount.main(WordCount.java:84)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Here are my config files:

yarn-site.xml:

<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>172.31.20.187:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>172.31.20.187:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>172.31.20.187:8030</value>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/home/ubuntu/hdfs/tmp</value>
  </property>
</configuration>

mapred-site.xml:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>640</value>
    <description>Larger resource limit for maps.</description>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx768m</value>
    <description>Heap-size for child jvms of maps.</description>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>640</value>
    <description>Larger resource limit for reduces.</description>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx768m</value>
    <description>Heap-size for child jvms of reduces.</description>
  </property>
  <property>
    <name>mapreduce.jobtracker.address</name>
    <value>172.31.20.187:8021</value>
  </property>
</configuration>

For the AWS S3 access-control settings in core-site.xml I followed this page: https://wiki.apache.org/hadoop/AmazonS3

core-site.xml:

<property>
  <name>fs.defaultFS</name>
  <value>s3n://mybkt</value>
</property>
<property>
  <name>io.file.buffer.size</name>
  <value>131072</value>
</property>
<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>123</value>
</property>
<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>456</value>
</property>

I also tried Hadoop v1, and there the s3n file system works fine for wordcount; it just doesn't seem to work under Hadoop v2.

Please help. Thanks,
Yue
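P.S. For comparison, the Hadoop v1 wordcount run that worked was essentially the same kind of invocation against s3n paths, something like the sketch below (the examples jar name and the exact bucket paths are from memory, so treat them as approximate rather than the literal command I ran):

hadoop jar hadoop-examples-1.x.x.jar wordcount s3n://mybkt/wc/ s3n://mybkt/out

That job read the input from S3 and wrote its output back to S3 without any of the errors shown above.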