Re: InputFormat and InputSplit - Network location name contains /:
Do not use the InputSplit's getLocations() API to supply your file path; it is not intended for such things, if that's what you've done in your current InputFormat implementation. If you're looking to store a single file path, use the FileSplit class, or, if your case is not as simple as that, use it as a base reference to build your Path-based InputSplit derivative. Its sources are at https://github.com/apache/hadoop-common/blob/release-2.4.0/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/input/FileSplit.java. Look at the Writable method overrides in particular to understand how to carry custom fields. On Thu, Apr 10, 2014 at 9:54 PM, Patcharee Thongtra wrote: > Hi, > > I wrote a custom InputFormat and InputSplit to handle netcdf file. I use > with a custom pig Load function. When I submitted a job by running a pig > script. I got an error below. From the error log, the network location name > is "hdfs://service-1-0.local:8020/user/patcharee/netcdf_data/wrfout_d02" - > my input file, containing "/", and hadoop does not allow. > > It could be something missing in my custom InputFormat and InputSplit. Any > ideas? Any help is appreciated, > > Patcharee > > > 2014-04-10 17:09:01,854 INFO [CommitterEvent Processor #0] > org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing > the event EventType: JOB_SETUP > > 2014-04-10 17:09:01,918 INFO [AsyncDispatcher event handler] > org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: > job_1387474594811_0071Job Transitioned from SETUP to RUNNING > > 2014-04-10 17:09:01,982 INFO [AsyncDispatcher event handler] > org.apache.hadoop.yarn.util.RackResolver: Resolved > hdfs://service-1-0.local:8020/user/patcharee/netcdf_data/wrfout_d02 to > /default-rack > > 2014-04-10 17:09:01,984 FATAL [AsyncDispatcher event handler] > org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread > java.lang.IllegalArgumentException: Network location name contains /: > hdfs://service-1-0.local:8020/user/patcharee/netcdf_data/wrfout_d02 > at org.apache.hadoop.net.NodeBase.set(NodeBase.java:87) > at org.apache.hadoop.net.NodeBase.<init>(NodeBase.java:65) > at > org.apache.hadoop.yarn.util.RackResolver.coreResolve(RackResolver.java:111) > at > org.apache.hadoop.yarn.util.RackResolver.resolve(RackResolver.java:95) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.<init>(TaskAttemptImpl.java:548) > at > org.apache.hadoop.mapred.MapTaskAttemptImpl.<init>(MapTaskAttemptImpl.java:47) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.MapTaskImpl.createAttempt(MapTaskImpl.java:62) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.addAttempt(TaskImpl.java:594) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.addAndScheduleAttempt(TaskImpl.java:581) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.access$1300(TaskImpl.java:100) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl$InitialScheduleTransition.transition(TaskImpl.java:871) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl$InitialScheduleTransition.transition(TaskImpl.java:866) > at > org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) > at > org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) > at > org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) > at > org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:632) > at > org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:99) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:1237) > at > org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:1231) > at > org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134) > at > org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:81) > at java.lang.Thread.run(Thread.java:662) > 2014-04-10 17:09:01,986 INFO [AsyncDispatcher event handler] > org.apache.hadoop. -- Harsh J
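To illustrate, here is a minimal sketch of such a Path-carrying split, modeled on FileSplit; the class name NetCDFSplit and its fields are illustrative, not taken from Patcharee's code:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.InputSplit;

// Minimal sketch: a split that carries a file path in its own field,
// serialized via the Writable methods, exactly as FileSplit does.
// getLocations() returns only DataNode hostnames, never the path itself.
public class NetCDFSplit extends InputSplit implements Writable {
  private Path file;        // the file this split covers
  private long length;      // split length in bytes
  private String[] hosts;   // DataNode hostnames holding the data

  public NetCDFSplit() {}   // no-arg constructor required for deserialization

  public NetCDFSplit(Path file, long length, String[] hosts) {
    this.file = file;
    this.length = length;
    this.hosts = hosts;
  }

  public Path getPath() { return file; }

  @Override
  public long getLength() { return length; }

  // Hostnames only; returning a path from here is what triggers the
  // "Network location name contains /" error seen above.
  @Override
  public String[] getLocations() {
    return hosts == null ? new String[0] : hosts;
  }

  // The framework serializes splits with these two methods; any custom
  // field (like the path) must be written and read here.
  @Override
  public void write(DataOutput out) throws IOException {
    Text.writeString(out, file.toString());
    out.writeLong(length);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    file = new Path(Text.readString(in));
    length = in.readLong();
    hosts = null; // locations are not serialized, matching FileSplit
  }
}

Note that the locations array is deliberately left out of write()/readFields(); FileSplit does the same, since locality hints are only consumed on the submitting side at scheduling time.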
Re: download hadoop-2.4
The official release can be found at: http://www.apache.org/dyn/closer.cgi/hadoop/common/ But you can also choose to check out the code from the svn/git repository. On Thu, Apr 10, 2014 at 8:08 PM, Mingjiang Shi wrote: > http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.0/ > > > On Fri, Apr 11, 2014 at 10:23 AM, lei liu wrote: > >> Hadoop-2.4 is release, where can I download the hadoop-2.4 code from? >> >> >> Thanks, >> >> LiuLei >> > > > > -- > Cheers > -MJ > -- Zhijie Shen Hortonworks Inc. http://hortonworks.com/
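For reference, checking out the 2.4.0 source from the svn tag above is a one-liner: svn checkout http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.0/ hadoop-2.4.0-src (the target directory name here is just an example).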
Re: download hadoop-2.4
http://svn.apache.org/repos/asf/hadoop/common/tags/release-2.4.0/ On Fri, Apr 11, 2014 at 10:23 AM, lei liu wrote: > Hadoop-2.4 is release, where can I download the hadoop-2.4 code from? > > > Thanks, > > LiuLei > -- Cheers -MJ
download hadoop-2.4
Hadoop 2.4 is released; where can I download the hadoop-2.4 code from? Thanks, LiuLei
Re: how can i archive old data in HDFS?
AFAIK, no tools now. Regards, Stanley Shi, On Fri, Apr 11, 2014 at 9:09 AM, ch huang wrote: > hi,maillist: > how can i archive old data in HDFS ,i have lot of old data ,the > data will not be use ,but it take lot of space to store it ,i want to > archive and zip the old data, HDFS can do this operation? >
which dir in HDFS can be clean ?
hi, maillist: my HDFS cluster has been running for about 1 year, and I find that many dirs are very large. I wonder if some of them can be cleaned, like /var/log/hadoop-yarn/apps?
Re: use setrep change number of file replicas,but not work
I set the replica number from 3 to 2, but when I dump NN metrics, PendingDeletionBlocks is zero. Why? If the check thread sleeps for an interval and then does its check work, how long is that interval? On Thu, Apr 10, 2014 at 10:50 AM, Harsh J wrote: > The replica deletion is asynchronous. You can track its deletions via > the NameNode's over-replicated blocks and the pending delete metrics. > > On Thu, Apr 10, 2014 at 7:16 AM, ch huang wrote: > > hi,maillist: > > i try modify replica number on some dir but it seems not work > > ,anyone know why? > > > > # sudo -u hdfs hadoop fs -setrep -R 2 /user/hive/warehouse/mytest > > Replication 2 set: > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 > > > > the file still store 3 replica ,but the echo number changed > > # hadoop fs -ls /user/hive/warehouse/mytest/dsp_request/2014-01-26 > > Found 1 items > > -rw-r--r-- 2 hdfs hdfs 17660 2014-01-26 18:34 > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 > > > > # sudo -u hdfs hdfs fsck > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 -files > -blocks > > -locations > > Connecting to namenode via http://ch11:50070 > > FSCK started by hdfs (auth:SIMPLE) from /192.168.11.12 for path > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 at Thu Apr > 10 > > 09:39:51 CST 2014 > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 17660 > bytes, 1 > > block(s): OK > > 0. > > > BP-1043055049-192.168.11.11-1382442676609:blk_-9219869107960013037_1976591 > > len=17660 repl=3 [192.168.11.13:50010, 192.168.11.10:50010, > > 192.168.11.14:50010] > > > > i remove the file ,and upload new file ,as i understand ,the new file > should > > be stored in 2 replica,but it still store 3 replica ,why? > > # sudo -u hdfs hadoop fs -rm -r -skipTrash > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/* > > Deleted /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 > > # hadoop fs -put ./data_0 > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/ > > [root@ch12 ~]# hadoop fs -ls > > /user/hive/warehouse/mytest/dsp_request/2014-01-26 > > Found 1 items > > -rw-r--r-- 3 root hdfs 17660 2014-04-10 09:40 > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 > > # sudo -u hdfs hdfs fsck > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 -files > -blocks > > -locations > > Connecting to namenode via http://ch11:50070 > > FSCK started by hdfs (auth:SIMPLE) from /192.168.11.12 for path > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 at Thu Apr > 10 > > 09:41:12 CST 2014 > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 17660 > bytes, 1 > > block(s): OK > > 0. > BP-1043055049-192.168.11.11-1382442676609:blk_6517693524032437780_8889786 > > len=17660 repl=3 [192.168.11.12:50010, 192.168.11.15:50010, > > 192.168.11.13:50010] > > > > -- > Harsh J >
Re: use setrep change number of file replicas,but not work
I can use fsck to get over-replicated blocks, but how can I track pending deletes? On Thu, Apr 10, 2014 at 10:50 AM, Harsh J wrote: > The replica deletion is asynchronous. You can track its deletions via > the NameNode's over-replicated blocks and the pending delete metrics. > > On Thu, Apr 10, 2014 at 7:16 AM, ch huang wrote: > > hi,maillist: > > i try modify replica number on some dir but it seems not work > > ,anyone know why? > > > > # sudo -u hdfs hadoop fs -setrep -R 2 /user/hive/warehouse/mytest > > Replication 2 set: > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 > > > > the file still store 3 replica ,but the echo number changed > > # hadoop fs -ls /user/hive/warehouse/mytest/dsp_request/2014-01-26 > > Found 1 items > > -rw-r--r-- 2 hdfs hdfs 17660 2014-01-26 18:34 > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 > > > > # sudo -u hdfs hdfs fsck > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 -files > -blocks > > -locations > > Connecting to namenode via http://ch11:50070 > > FSCK started by hdfs (auth:SIMPLE) from /192.168.11.12 for path > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 at Thu Apr > 10 > > 09:39:51 CST 2014 > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 17660 > bytes, 1 > > block(s): OK > > 0. > > > BP-1043055049-192.168.11.11-1382442676609:blk_-9219869107960013037_1976591 > > len=17660 repl=3 [192.168.11.13:50010, 192.168.11.10:50010, > > 192.168.11.14:50010] > > > > i remove the file ,and upload new file ,as i understand ,the new file > should > > be stored in 2 replica,but it still store 3 replica ,why? > > # sudo -u hdfs hadoop fs -rm -r -skipTrash > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/* > > Deleted /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 > > # hadoop fs -put ./data_0 > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/ > > [root@ch12 ~]# hadoop fs -ls > > /user/hive/warehouse/mytest/dsp_request/2014-01-26 > > Found 1 items > > -rw-r--r-- 3 root hdfs 17660 2014-04-10 09:40 > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 > > # sudo -u hdfs hdfs fsck > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 -files > -blocks > > -locations > > Connecting to namenode via http://ch11:50070 > > FSCK started by hdfs (auth:SIMPLE) from /192.168.11.12 for path > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 at Thu Apr > 10 > > 09:41:12 CST 2014 > > /user/hive/warehouse/mytest/dsp_request/2014-01-26/data_0 17660 > bytes, 1 > > block(s): OK > > 0. > BP-1043055049-192.168.11.11-1382442676609:blk_6517693524032437780_8889786 > > len=17660 repl=3 [192.168.11.12:50010, 192.168.11.15:50010, > > 192.168.11.13:50010] > > > > -- > Harsh J >
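For the tracking question, a minimal sketch of one approach: poll the NameNode's /jmx servlet for the FSNamesystem bean, which carries the PendingDeletionBlocks and ExcessBlocks counters. The host:port (ch11:50070) is taken from the fsck output in this thread; whether the servlet is available depends on your build.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

// Sketch: scan the NameNode JMX JSON for the two counters relevant here.
// PendingDeletionBlocks = replicas queued for deletion after the setrep;
// ExcessBlocks = blocks the NN currently considers over-replicated.
public class ReplicaDeleteWatch {
  public static void main(String[] args) throws Exception {
    URL url = new URL(
        "http://ch11:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem");
    BufferedReader in =
        new BufferedReader(new InputStreamReader(url.openStream()));
    try {
      String line;
      while ((line = in.readLine()) != null) {
        // crude substring scan of the pretty-printed JSON response;
        // a JSON parser would be more robust
        if (line.contains("PendingDeletionBlocks")
            || line.contains("ExcessBlocks")) {
          System.out.println(line.trim());
        }
      }
    } finally {
      in.close();
    }
  }
}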
how can i archive old data in HDFS?
hi, maillist: how can I archive old data in HDFS? I have a lot of old data that will not be used, but it takes a lot of space to store. I want to archive and zip the old data. Can HDFS do this operation?
Re: hadoop 2.0 upgrade to 2.4
Motty, https://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Installation-Guide/CDH5-Installation-Guide.html provides instructions to upgrade from CDH4 to CDH5 (which bundles Hadoop 2.3.0). If your intention is to use CDH5, that should help you. If you have further questions about it, the right alias to use is cdh-u...@cloudera.org If your intention is to use Apache Hadoop 2.4.0, some of the CDH documentation above may still be relevant. Thanks. On Thu, Apr 10, 2014 at 12:20 PM, motty cruz wrote: > Hi All, > > I currently have a hadoop 2.0 cluster in production, I want to upgrade to > latest release. > > current version: > [root@doop1 ~]# hadoop version > Hadoop 2.0.0-cdh4.6.0 > > Cluster has the following services: > hbase > hive > hue > impala > mapreduce > oozie > sqoop > zookeeper > > can someone point me to a howto upgrade hadoop from 2.0 to hadoop 2.4.0? > > Thanks in advance, > > -- Alejandro
hadoop 2.0 upgrade to 2.4
Hi All, I currently have a hadoop 2.0 cluster in production and want to upgrade to the latest release. current version: [root@doop1 ~]# hadoop version Hadoop 2.0.0-cdh4.6.0 Cluster has the following services: hbase, hive, hue, impala, mapreduce, oozie, sqoop, zookeeper. Can someone point me to a how-to for upgrading hadoop from 2.0 to hadoop 2.4.0? Thanks in advance,
InputFormat and InputSplit - Network location name contains /:
Hi, I wrote a custom InputFormat and InputSplit to handle netcdf files; I use them with a custom pig Load function. When I submitted a job by running a pig script, I got the error below. From the error log, the network location name is "hdfs://service-1-0.local:8020/user/patcharee/netcdf_data/wrfout_d02" - my input file - which contains "/", and hadoop does not allow that. It could be something missing in my custom InputFormat and InputSplit. Any ideas? Any help is appreciated, Patcharee 2014-04-10 17:09:01,854 INFO [CommitterEvent Processor #0] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_SETUP 2014-04-10 17:09:01,918 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1387474594811_0071Job Transitioned from SETUP to RUNNING 2014-04-10 17:09:01,982 INFO [AsyncDispatcher event handler] org.apache.hadoop.yarn.util.RackResolver: Resolved hdfs://service-1-0.local:8020/user/patcharee/netcdf_data/wrfout_d02 to /default-rack 2014-04-10 17:09:01,984 FATAL [AsyncDispatcher event handler] org.apache.hadoop.yarn.event.AsyncDispatcher: Error in dispatcher thread java.lang.IllegalArgumentException: Network location name contains /: hdfs://service-1-0.local:8020/user/patcharee/netcdf_data/wrfout_d02 at org.apache.hadoop.net.NodeBase.set(NodeBase.java:87) at org.apache.hadoop.net.NodeBase.<init>(NodeBase.java:65) at org.apache.hadoop.yarn.util.RackResolver.coreResolve(RackResolver.java:111) at org.apache.hadoop.yarn.util.RackResolver.resolve(RackResolver.java:95) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl.<init>(TaskAttemptImpl.java:548) at org.apache.hadoop.mapred.MapTaskAttemptImpl.<init>(MapTaskAttemptImpl.java:47) at org.apache.hadoop.mapreduce.v2.app.job.impl.MapTaskImpl.createAttempt(MapTaskImpl.java:62) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.addAttempt(TaskImpl.java:594) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.addAndScheduleAttempt(TaskImpl.java:581) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.access$1300(TaskImpl.java:100) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl$InitialScheduleTransition.transition(TaskImpl.java:871) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl$InitialScheduleTransition.transition(TaskImpl.java:866) at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362) at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302) at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46) at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:632) at org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl.handle(TaskImpl.java:99) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:1237) at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$TaskEventDispatcher.handle(MRAppMaster.java:1231) at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134) at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:81) at java.lang.Thread.run(Thread.java:662) 2014-04-10 17:09:01,986 INFO [AsyncDispatcher event handler] org.apache.hadoop.
Re: not able to run map reduce job example on aws machine
"java.lang.IllegalArgumentException: Does not contain a valid host:port authority: poc_hadoop04:46162" Hostnames cannot carry an underscore '_' character per RFC 952 and its extensions. Please fix your hostname to not carry one. On Thu, Apr 10, 2014 at 5:14 PM, Rahul Singh wrote: > here is my mapred.site.xml config > > > mapred.job.tracker > localhost:54311 > The host and port that the MapReduce job tracker runs > at. If "local", then jobs are run in-process as a single map > and reduce task. > > > > > Also, The job runs fine in memory, if i remove the dependency on yarn, i.e. > if i comment out: > >mapreduce.framework.name > yarn > > > in mapred-site.xml. > > > > > On Thu, Apr 10, 2014 at 4:43 PM, Kiran Dangeti > wrote: >> >> Rahul, >> >> Please check the port name given in mapred.site.xml >> Thanks >> Kiran >> >> On Thu, Apr 10, 2014 at 3:23 PM, Rahul Singh >> wrote: >>> >>> Hi, >>> I am getting following exception while running word count example, >>> >>> 14/04/10 15:17:09 INFO mapreduce.Job: Task Id : >>> attempt_1397123038665_0001_m_00_2, Status : FAILED >>> Container launch failed for container_1397123038665_0001_01_04 : >>> java.lang.IllegalArgumentException: Does not contain a valid host:port >>> authority: poc_hadoop04:46162 >>> at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:211) >>> at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:163) >>> at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:152) >>> at >>> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:210) >>> at >>> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.(ContainerManagementProtocolProxy.java:196) >>> at >>> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117) >>> at >>> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403) >>> at >>> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138) >>> at >>> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369) >>> at >>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) >>> at >>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) >>> at java.lang.Thread.run(Thread.java:662) >>> >>> >>> I have everything configured with hdfs running where i am able to create >>> files and directories. running jps on my machine shows all components >>> running. >>> >>> 10290 NameNode >>> 10416 DataNode >>> 10738 ResourceManager >>> 11634 Jps >>> 10584 SecondaryNameNode >>> 10844 NodeManager >>> >>> >>> Any pointers will be appreciated. >>> >>> Thanks and Regards, >>> -Rahul Singh >> >> > -- Harsh J
Re: not able to run map reduce job example on aws machine
here is my mapred-site.xml config <property> <name>mapred.job.tracker</name> <value>localhost:54311</value> <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description> </property> Also, the job runs fine in memory if I remove the dependency on yarn, i.e. if I comment out: <property> <name>mapreduce.framework.name</name> <value>yarn</value> </property> in mapred-site.xml. On Thu, Apr 10, 2014 at 4:43 PM, Kiran Dangeti wrote: > Rahul, > > Please check the port name given in mapred-site.xml > Thanks > Kiran > > On Thu, Apr 10, 2014 at 3:23 PM, Rahul Singh > wrote: > >> Hi, >> I am getting following exception while running word count example, >> >> 14/04/10 15:17:09 INFO mapreduce.Job: Task Id : >> attempt_1397123038665_0001_m_00_2, Status : FAILED >> Container launch failed for container_1397123038665_0001_01_04 : >> java.lang.IllegalArgumentException: Does not contain a valid host:port >> authority: poc_hadoop04:46162 >> at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:211) >> at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:163) >> at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:152) >> at >> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:210) >> at >> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:196) >> at >> org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117) >> at >> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403) >> at >> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138) >> at >> org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369) >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) >> at java.lang.Thread.run(Thread.java:662) >> >> >> I have everything configured with hdfs running where i am able to create >> files and directories. running jps on my machine shows all components >> running. >> >> 10290 NameNode >> 10416 DataNode >> 10738 ResourceManager >> 11634 Jps >> 10584 SecondaryNameNode >> 10844 NodeManager >> >> >> Any pointers will be appreciated. >> >> Thanks and Regards, >> -Rahul Singh >> > >
Re: File requests to Namenode
Thanks !!! Diwakar Sent from my iPhone > On Apr 9, 2014, at 9:22 PM, Harsh J wrote: > > You could look at metrics the NN publishes, or look at/process the > HDFS audit log. > >> On Wed, Apr 9, 2014 at 6:36 PM, Diwakar Sharma >> wrote: >> How and where to check how many datanode block address requests a namenode >> gets when running a map reduce job. >> >> - Diwakar > > > > -- > Harsh J
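Expanding on the audit-log suggestion Harsh quotes above, a minimal sketch that counts block-location requests by scanning the NameNode audit log; the log path and the "cmd=open" entry format are assumptions based on a typical deployment:

import java.io.BufferedReader;
import java.io.FileReader;

// Sketch: each audit entry looks roughly like
// "2014-04-10 ... ugi=hdfs ip=/192.168.1.5 cmd=open src=/data/f dst=null ..."
// and cmd=open is logged when a client asks the NN for a file's block
// locations, which is the request the question is about.
public class AuditOpenCount {
  public static void main(String[] args) throws Exception {
    // assumed default audit log location; adjust for your install
    BufferedReader in = new BufferedReader(
        new FileReader("/var/log/hadoop-hdfs/hdfs-audit.log"));
    try {
      long opens = 0;
      String line;
      while ((line = in.readLine()) != null) {
        if (line.contains("cmd=open")) {
          opens++;
        }
      }
      System.out.println("block location (open) requests: " + opens);
    } finally {
      in.close();
    }
  }
}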
Re: not able to run map reduce job example on aws machine
Rahul, Please check the port name given in mapred-site.xml Thanks Kiran On Thu, Apr 10, 2014 at 3:23 PM, Rahul Singh wrote: > Hi, > I am getting following exception while running word count example, > > 14/04/10 15:17:09 INFO mapreduce.Job: Task Id : > attempt_1397123038665_0001_m_00_2, Status : FAILED > Container launch failed for container_1397123038665_0001_01_04 : > java.lang.IllegalArgumentException: Does not contain a valid host:port > authority: poc_hadoop04:46162 > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:211) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:163) > at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:152) > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:210) > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:196) > at > org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138) > at > org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369) > at > java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:662) > > > I have everything configured with hdfs running where i am able to create > files and directories. running jps on my machine shows all components > running. > > 10290 NameNode > 10416 DataNode > 10738 ResourceManager > 11634 Jps > 10584 SecondaryNameNode > 10844 NodeManager > > > Any pointers will be appreciated. > > Thanks and Regards, > -Rahul Singh >
not able to run map reduce job example on aws machine
Hi, I am getting the following exception while running the word count example: 14/04/10 15:17:09 INFO mapreduce.Job: Task Id : attempt_1397123038665_0001_m_00_2, Status : FAILED Container launch failed for container_1397123038665_0001_01_04 : java.lang.IllegalArgumentException: Does not contain a valid host:port authority: poc_hadoop04:46162 at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:211) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:163) at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:152) at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.newProxy(ContainerManagementProtocolProxy.java:210) at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy$ContainerManagementProtocolProxyData.<init>(ContainerManagementProtocolProxy.java:196) at org.apache.hadoop.yarn.client.api.impl.ContainerManagementProtocolProxy.getProxy(ContainerManagementProtocolProxy.java:117) at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl.getCMProxy(ContainerLauncherImpl.java:403) at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$Container.launch(ContainerLauncherImpl.java:138) at org.apache.hadoop.mapreduce.v2.app.launcher.ContainerLauncherImpl$EventProcessor.run(ContainerLauncherImpl.java:369) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) I have everything configured, with hdfs running, where I am able to create files and directories. Running jps on my machine shows all components running. 10290 NameNode 10416 DataNode 10738 ResourceManager 11634 Jps 10584 SecondaryNameNode 10844 NodeManager Any pointers will be appreciated. Thanks and Regards, -Rahul Singh