Re: about hadoop-2.2.0 mapred.child.java.opts
Actually, it's the other way around (thanks Sandy for catching this error in my post). The presence of mapreduce.map|reduce.java.opts overrides mapred.child.java.opts, not the other way round as I had stated earlier (below).

On Wed, Dec 4, 2013 at 1:28 PM, Harsh J ha...@cloudera.com wrote: Yes, but the old property is yet to be entirely removed (removal of configs is graceful). These properties were introduced to provide a more fine-grained way to configure each type of task separately, but the older value continues to be accepted if present; the current behaviour is that if the MR runtime finds mapred.child.java.opts configured, it will override the values of the mapreduce.map|reduce.java.opts configs. To configure mapreduce.map|reduce.java.opts, therefore, you should make sure you aren't passing mapred.child.java.opts (which is also, intentionally, no longer in mapred-default.xml).

On Wed, Dec 4, 2013 at 12:56 PM, Henry Hung ythu...@winbond.com wrote: Hello, I have a question. Is it correct to say that in hadoop-2.2.0, the mapred-site.xml node "mapred.child.java.opts" is replaced by the two new nodes "mapreduce.map.java.opts" and "mapreduce.reduce.java.opts"? Best regards, Henry

-- Harsh J
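For reference, a minimal mapred-site.xml sketch using the new per-task-type properties could look like the following; the -Xmx values are only illustrative and should be sized to fit your container settings:

  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1024m</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx2048m</value>
  </property>

As advised above, make sure mapred.child.java.opts is not also being passed anywhere in the job or cluster configuration, so there is no ambiguity about which value wins.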
Aw: Re: mapreduce.jobtracker.expire.trackers.interval no effect
Hi Adam, in our environment it does not matter what I insert there, it always takes over 600 seconds. I tried 3 and the result was the same. Regards, Hansi

Sent: Tuesday, 03 December 2013, 19:23 From: Adam Kawa kawa.a...@gmail.com To: user@hadoop.apache.org Subject: Re: mapreduce.jobtracker.expire.trackers.interval no effect

I did a small test, and a setting of mapred.tasktracker.expiry.interval=6 worked for me (the TT became considered as lost after around 66 seconds). Can the formula be: mapred.tasktracker.expiry.interval + 2 * some-heartbeat-interval-that-is-3-sec-by-default? Otherwise, is the 6 sec some kind of time needed to make the decision to consider the TT as lost?

2013/12/3 Hansi Klose hansi.kl...@web.de: I forgot to say that we use Cloudera 2.0.0-mr1-cdh4.2.0

Sent: Tuesday, 03 December 2013, 17:38 From: Hansi Klose hansi.kl...@web.de To: user@hadoop.apache.org Subject: mapreduce.jobtracker.expire.trackers.interval no effect

Hi, we want to set the heartbeat timeout for a tasktracker: if the tasktracker does not send heartbeats for 60 seconds it should be marked as lost. I found the parameter mapreduce.jobtracker.expire.trackers.interval, which sounds right to me. I set

  <property>
    <name>mapreduce.jobtracker.expire.trackers.interval</name>
    <value>6</value>
  </property>

in the mapred-site.xml on all servers and restarted the jobtracker and all tasktrackers. I started a benchmark (hadoop jar hadoop-examples.jar randomwriter rand) and every tasktracker gets 2 jobs; it is a small test environment. On one tasktracker I stopped the network. On the jobtracker I could see the "Seconds since heartbeat" increasing, but after 60 seconds the tasktracker was still in the overview. Even in the log of the jobtracker I found nothing. After over 600 seconds I found the message org.apache.hadoop.mapred.JobTracker: Lost tracker, and the tasktracker wasn't shown any more on the jobtracker. Isn't this the right setting? Regards Hansi
Re: Client mapred tries to renew a token with renewer specified as nobody
Well, that does not seem to be the issue. The Kerberos ticket gets refreshed automatically, but the delegation token doesn't.

On Dec 3, 2013, at 20:24, Raviteja Chirala wrote: Alternatively, you can schedule a cron job to do a kinit every 20 hours or so, just to renew the token before it expires.
RE: about hadoop-2.2.0 mapred.child.java.opts
@Harsh J Thank you, I intend to upgrade from Hadoop 1.0.4 and this kind of information is very helpful. Best regards, Henry

-----Original Message----- From: Harsh J [mailto:ha...@cloudera.com] Sent: Wednesday, December 04, 2013 4:20 PM To: user@hadoop.apache.org Subject: Re: about hadoop-2.2.0 mapred.child.java.opts

Actually, it's the other way around (thanks Sandy for catching this error in my post). The presence of mapreduce.map|reduce.java.opts overrides mapred.child.java.opts, not the other way round as I had stated earlier (below).

On Wed, Dec 4, 2013 at 1:28 PM, Harsh J ha...@cloudera.com wrote: Yes, but the old property is yet to be entirely removed (removal of configs is graceful). These properties were introduced to provide a more fine-grained way to configure each type of task separately, but the older value continues to be accepted if present; the current behaviour is that if the MR runtime finds mapred.child.java.opts configured, it will override the values of the mapreduce.map|reduce.java.opts configs. To configure mapreduce.map|reduce.java.opts, therefore, you should make sure you aren't passing mapred.child.java.opts (which is also, intentionally, no longer in mapred-default.xml).

On Wed, Dec 4, 2013 at 12:56 PM, Henry Hung ythu...@winbond.com wrote: Hello, I have a question. Is it correct to say that in hadoop-2.2.0, the mapred-site.xml node "mapred.child.java.opts" is replaced by the two new nodes "mapreduce.map.java.opts" and "mapreduce.reduce.java.opts"? Best regards, Henry

-- Harsh J
Aw: mapreduce.jobtracker.expire.trackers.interval no effect
Hi. I think I found the reason. I looked at the job.xml and found the parameters mapred.tasktracker.expiry.interval = 600 and mapreduce.jobtracker.expire.trackers.interval = 3. So I tried the deprecated parameter mapred.tasktracker.expiry.interval in my configuration and voilà, it works! Why do they write that the parameter is deprecated when the new one does not work and gets overwritten by the old one's default value? http://archive.cloudera.com/cdh4/cdh/4/hadoop-2.0.0-cdh4.2.0/hadoop-project-dist/hadoop-common/DeprecatedProperties.html Regards Hansi

Sent: Tuesday, 03 December 2013, 17:38 From: Hansi Klose hansi.kl...@web.de To: user@hadoop.apache.org Subject: mapreduce.jobtracker.expire.trackers.interval no effect

Hi, we want to set the heartbeat timeout for a tasktracker: if the tasktracker does not send heartbeats for 60 seconds it should be marked as lost. I found the parameter mapreduce.jobtracker.expire.trackers.interval, which sounds right to me. I set

  <property>
    <name>mapreduce.jobtracker.expire.trackers.interval</name>
    <value>6</value>
  </property>

in the mapred-site.xml on all servers and restarted the jobtracker and all tasktrackers. I started a benchmark (hadoop jar hadoop-examples.jar randomwriter rand) and every tasktracker gets 2 jobs; it is a small test environment. On one tasktracker I stopped the network. On the jobtracker I could see the "Seconds since heartbeat" increasing, but after 60 seconds the tasktracker was still in the overview. Even in the log of the jobtracker I found nothing. After over 600 seconds I found the message org.apache.hadoop.mapred.JobTracker: Lost tracker, and the tasktracker wasn't shown any more on the jobtracker. Isn't this the right setting? Regards Hansi
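In other words, on this MR1 (CDH 4.2.0) setup the workaround is to set the old key. A sketch of the mapred-site.xml entry, assuming the value is in milliseconds as with the documented default of 600000 (10 minutes), and using 60000 only as an illustrative 60-second timeout:

  <property>
    <name>mapred.tasktracker.expiry.interval</name>
    <value>60000</value>
  </property>

With 60000 here, a tasktracker that stops heartbeating should be declared lost after roughly 60 seconds, plus a couple of heartbeat intervals as observed earlier in the thread.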
Check compression codec of an HDFS file
What's the best way to check the compression codec that an HDFS file was written with? We use both Gzip and Snappy compression, so I want a way to determine how a specific file is compressed. The closest I found is getCodec (http://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/compress/CompressionCodecFactory.html#getCodec(org.apache.hadoop.fs.Path)), but that relies on the file name suffix, which doesn't exist since reducers typically don't add a suffix to the filenames they create. Thanks
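One approach, sketched below under the assumption that the files are either SequenceFiles or plain gzip/Snappy streams: a SequenceFile records its codec in its own header, so SequenceFile.Reader can report it directly; for plain files you can peek at the first two bytes, since gzip streams always start with the magic bytes 0x1F 0x8B, while Hadoop's Snappy block format has no such magic, so "not gzip" only implies Snappy when those are the only two codecs in use. The class name is made up for illustration; only the Hadoop API calls are real.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.compress.CompressionCodec;

public class CodecProbe {
  public static String probe(Configuration conf, Path p) throws IOException {
    FileSystem fs = p.getFileSystem(conf);
    // SequenceFiles carry their codec in the header.
    try {
      SequenceFile.Reader r = new SequenceFile.Reader(fs, p, conf);
      try {
        CompressionCodec codec = r.getCompressionCodec();
        return codec == null ? "uncompressed sequence file" : codec.getClass().getName();
      } finally {
        r.close();
      }
    } catch (IOException notASequenceFile) {
      // Not a SequenceFile; fall through and sniff the raw bytes instead.
    }
    FSDataInputStream in = fs.open(p);
    try {
      int b0 = in.read();
      int b1 = in.read();
      if (b0 == 0x1f && b1 == 0x8b) {
        return "gzip"; // gzip magic bytes
      }
      return "not gzip (possibly Snappy, if those are the only codecs in use)";
    } finally {
      in.close();
    }
  }
}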
Re: issue about capacity scheduler
If I have 40 GB of cluster memory and yarn.scheduler.capacity.maximum-am-resource-percent is set to 0.1, does that mean that when I launch an appMaster, 4 GB is allocated to it? If so, why does increasing the value cause more appMasters to run concurrently instead of fewer?

On Thu, Dec 5, 2013 at 5:04 AM, Jian He j...@hortonworks.com wrote: You can probably try increasing yarn.scheduler.capacity.maximum-am-resource-percent; this controls the max concurrently running AMs. Thanks, Jian

On Wed, Dec 4, 2013 at 1:33 AM, ch huang justlo...@gmail.com wrote: Hi mailing list: I use the YARN framework with the capacity scheduler, and I have two queues, one for Hive and the other for big MR jobs. The Hive queue works fine because Hive tasks are very fast. But consider this: user A submits two big MR jobs, so the first big job eats all the resources belonging to the queue and the other big MR job has to wait until the first job finishes. How can I let the same user's MR jobs run in parallel?
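As I understand this setting (a sketch of the semantics, not an authoritative statement of the scheduler internals): yarn.scheduler.capacity.maximum-am-resource-percent caps the total share of resources that all running ApplicationMasters together may occupy (applied per queue in proportion to queue capacity), not the size of a single AM. With 40 GB of cluster memory and a value of 0.1, roughly 4 GB in total is available for AM containers, so with AM containers of, say, 1-2 GB each, only a few applications can be active at once; raising the percentage raises that total budget, which is why more AMs can then run concurrently. In capacity-scheduler.xml that would look like this, with 0.2 purely as an example value:

  <property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>0.2</value>
  </property>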
Re: issue about capacity scheduler
Another question: I set yarn.scheduler.minimum-allocation-mb to 2 GB, so a container should be at least 2 GB, but I see the appMaster container only uses a 1 GB heap size. Why?

# ps -ef|grep 8062 yarn 8062 8047 5 09:04 ? 00:00:09 /usr/java/jdk1.7.0_25/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=/data/mrlocal/1/yarn/logs/application_1386139114497_0024/container_1386139114497_0024_01_01 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1024m org.apache.hadoop.mapreduce.v2.app.MRAppMaster

On Thu, Dec 5, 2013 at 5:04 AM, Jian He j...@hortonworks.com wrote: You can probably try increasing yarn.scheduler.capacity.maximum-am-resource-percent; this controls the max concurrently running AMs. Thanks, Jian

On Wed, Dec 4, 2013 at 1:33 AM, ch huang justlo...@gmail.com wrote: Hi mailing list: I use the YARN framework with the capacity scheduler, and I have two queues, one for Hive and the other for big MR jobs. The Hive queue works fine because Hive tasks are very fast. But consider this: user A submits two big MR jobs, so the first big job eats all the resources belonging to the queue and the other big MR job has to wait until the first job finishes. How can I let the same user's MR jobs run in parallel?
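The 1 GB heap visible in the ps output comes from a different knob than the scheduler minimum: the MapReduce ApplicationMaster asks for a container of yarn.app.mapreduce.am.resource.mb (1536 MB by default) and launches its JVM with yarn.app.mapreduce.am.command-opts (-Xmx1024m by default). The scheduler then normalizes the container request up to at least yarn.scheduler.minimum-allocation-mb, but that rounding does not touch the -Xmx flag. A sketch of raising both together in mapred-site.xml, with purely illustrative values:

  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx1638m</value>
  </property>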
Re: Container [pid=22885,containerID=container_1386156666044_0001_01_000013] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 332.5 GB of 8 GB virtual memo
We have already tried several values for these two parameters, but it seems to have no effect.

2013/12/5 Tsuyoshi OZAWA ozawa.tsuyo...@gmail.com: Hi, Please check the properties like mapreduce.reduce.memory.mb and mapreduce.map.memory.mb in mapred-site.xml. These properties decide the resource limits for mappers/reducers.

On Wed, Dec 4, 2013 at 10:16 PM, panfei cnwe...@gmail.com wrote: -- Forwarded message -- From: panfei cnwe...@gmail.com Date: 2013/12/4 Subject: Container [pid=22885,containerID=container_138615044_0001_01_13] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 332.5 GB of 8 GB virtual memory used. Killing container. To: CDH Users cdh-u...@cloudera.org

Hi All: We are using CDH 4.5 Hadoop for our production cluster. When we submit some (not all) jobs from Hive, we get the following exception; it seems that neither the physical memory nor the virtual memory is enough for the job to run:

Task with the most failures(4): - Task ID: task_138615044_0001_m_00 URL: http://namenode-1:8088/taskdetails.jsp?jobid=job_138615044_0001tipid=task_138615044_0001_m_00 - Diagnostic Messages for this Task: Container [pid=22885,containerID=container_138615044_0001_01_13] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 332.5 GB of 8 GB virtual memory used. Killing container. Dump of the process-tree for container_138615044_0001_01_13 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 22885 22036 22885 22885 (java) 5414 108 356993519616 271953 /usr/java/default/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx200m -Djava.io.tmpdir=/data/yarn/local/usercache/hive/appcache/application_138615044_0001/container_138615044_0001_01_13/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=/var/log/hadoop-yarn/containers/application_138615044_0001/container_138615044_0001_01_13 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 192.168.101.55 60841 attempt_138615044_0001_m_00_3 13

The following is some of our configuration:

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>12288</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>8</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>6</value>
  </property>

Can you give me some advice? Thanks a lot. -- 不学习,不知道 -- 不学习,不知道 -- - Tsuyoshi -- 不学习,不知道
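A detail worth noting from the process dump above: the child JVM is running with -Xmx200m, which is the default task heap, so it may be that only container sizes were raised while the task JVM heap was not. On YARN these usually have to be changed as a pair: mapreduce.map.memory.mb / mapreduce.reduce.memory.mb set the container size the NodeManager enforces, and mapreduce.map.java.opts / mapreduce.reduce.java.opts set the JVM heap, which should be somewhat smaller than the container. A mapred-site.xml sketch with illustrative values, not a recommendation for this particular cluster:

  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1638m</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx3276m</value>
  </property>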
Re: Container [pid=22885,containerID=container_1386156666044_0001_01_000013] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 332.5 GB of 8 GB virtual memo
Hi, please refer to http://hortonworks.com/blog/how-to-plan-and-configure-yarn-in-hdp-2-0/

2013/12/5 panfei cnwe...@gmail.com: We have already tried several values for these two parameters, but it seems to have no effect.

2013/12/5 Tsuyoshi OZAWA ozawa.tsuyo...@gmail.com: Hi, Please check the properties like mapreduce.reduce.memory.mb and mapreduce.map.memory.mb in mapred-site.xml. These properties decide the resource limits for mappers/reducers.

On Wed, Dec 4, 2013 at 10:16 PM, panfei cnwe...@gmail.com wrote: -- Forwarded message -- From: panfei cnwe...@gmail.com Date: 2013/12/4 Subject: Container [pid=22885,containerID=container_138615044_0001_01_13] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 332.5 GB of 8 GB virtual memory used. Killing container. To: CDH Users cdh-u...@cloudera.org

Hi All: We are using CDH 4.5 Hadoop for our production cluster. When we submit some (not all) jobs from Hive, we get the following exception; it seems that neither the physical memory nor the virtual memory is enough for the job to run:

Task with the most failures(4): - Task ID: task_138615044_0001_m_00 URL: http://namenode-1:8088/taskdetails.jsp?jobid=job_138615044_0001tipid=task_138615044_0001_m_00 - Diagnostic Messages for this Task: Container [pid=22885,containerID=container_138615044_0001_01_13] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 332.5 GB of 8 GB virtual memory used. Killing container. Dump of the process-tree for container_138615044_0001_01_13 : |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE |- 22885 22036 22885 22885 (java) 5414 108 356993519616 271953 /usr/java/default/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx200m -Djava.io.tmpdir=/data/yarn/local/usercache/hive/appcache/application_138615044_0001/container_138615044_0001_01_13/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=/var/log/hadoop-yarn/containers/application_138615044_0001/container_138615044_0001_01_13 -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 192.168.101.55 60841 attempt_138615044_0001_m_00_3 13

The following is some of our configuration:

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>12288</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>8</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>6</value>
  </property>

Can you give me some advice? Thanks a lot. -- 不学习,不知道 -- 不学习,不知道 -- - Tsuyoshi -- 不学习,不知道
Re: Client mapred tries to renew a token with renewer specified as nobody
The error clearly says that the renewer is wrong (the renewer is marked as 'nobody', but mapred is trying to renew the token); you may want to check this. Thanks, +Vinod

On Dec 2, 2013, at 8:25 AM, Rainer Toebbicke wrote: 2013-12-02 15:57:08,541 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:mapred/xxx.cern...@cern.ch (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Client mapred tries to renew a token with renewer specified as nobody
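For context, the renewer of an HDFS delegation token is fixed at the moment the token is obtained and cannot be changed afterwards; only a principal matching that renewer may renew it. A minimal sketch of where this happens, with "mapred" standing in for whatever renewer your job-submission path should be using; the surrounding class is illustrative and not taken from the thread:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.security.token.Token;

public class TokenRenewerExample {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // The argument is the renewer recorded inside the token. If it is left
    // as "nobody" (or some other unexpected value), the mapred principal
    // will later fail to renew it, which matches the
    // AccessControlException in the log above.
    Token<?> token = fs.getDelegationToken("mapred");
    System.out.println("Obtained token: " + token);
  }
}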
Re: issue about the MR JOB local dir
These are the directories where the NodeManager (as configured) will store its local files. Local files include scripts, jars, libraries - all files sent to the nodes via the DistributedCache. Thanks, +Vinod

On Dec 3, 2013, at 5:26 PM, ch huang wrote: Hi mailing list: I see three dirs in my local MR job dir, and I do not know what these dirs are used for. Does anyone know? # ls /data/1/mrlocal/yarn/local/ filecache/ nmPrivate/ usercache/
Re: issue about capacity scheduler
If both of the jobs in the MR queue are from the same user, the CapacityScheduler will only try to run them one after another. If possible, run them as different users; at that point you will see sharing across the jobs because they are from different users. Thanks, +Vinod

On Dec 4, 2013, at 1:33 AM, ch huang wrote: Hi mailing list: I use the YARN framework with the capacity scheduler, and I have two queues, one for Hive and the other for big MR jobs. The Hive queue works fine because Hive tasks are very fast. But consider this: user A submits two big MR jobs, so the first big job eats all the resources belonging to the queue and the other big MR job has to wait until the first job finishes. How can I let the same user's MR jobs run in parallel?
Re: issue about the MR JOB local dir
Thank you, but it seems the doc is a little old. The doc says:

- PUBLIC: local-dir/filecache
- PRIVATE: local-dir/usercache/<user>/filecache
- APPLICATION: local-dir/usercache/<user>/appcache/app-id/

but here is my nodemanager directory; I guess nmPrivate belongs to the private dir, and the filecache dir does not exist under usercache:

# ls /data/mrlocal/1/yarn/local filecache nmPrivate usercache
[root@CHBM223 conf]# ls /data/mrlocal/1/yarn/local/filecache/ -1058429088916409723 4529188628984375230 -7014620238384418063 1084965746802723478 4537624275313838973 -7168597014714301440 -1624997938511480096 4630901056913375526 7270199361370573766 -1664837184667315424 -4642830643595652223 -7332220817185869511 1725675017861848111 4715236827440900877332904188082338506 1838346487029342338 4790459366530674957 -7450121760156930096 1865044782300039774800525395984004560 7478948409771297223 -2348110367263014791 -5080956154405911478 7486468764131639983 -2569725565520513438 524923119076958393-7755253483162230956 -2590767617048813033 -5270961733852362332 -7859425335924192987 2787947055181616358 -5381775829268220744 7967711417630616031 2816094634154232444 -5845090920164902899 8115657316961272063 286373945366133510-587409153437667574 -8196745140008584754 2931191327895309259 -5951079387471670627 -8338714062663466224 -304471400571947298 -6076923167039033115 -8473967805299855837 3250195466880585846080416638029534254 8513492322348652110 -3331048722364108374 6332597539903254606 -8567312237113801580 3360339691049457808 634308406792699 8737308241488535006 3368354412003774516566344665060319340 -8893869581665287805 3628504729266619560 -6639258108397695527 -8895898681278542021 -3801380133229678986 -6653760362065293300 8926294383627727352 3837066533086156807 -6782198269120858036 -8964326004503603190 3929223016635331138 -6814427383139267223 -9049325747073392755 4126862917222506438 -6814979781017122863 -9186700026428986961
[root@CHBM223 conf]# ls /data/mrlocal/1/yarn/local/nmPrivate/ application_1385444985453_0001 application_1385453784842_0010 application_1385522685434_0081 container_1385522685434_0073_01_14.pid application_1385445543402_0003 application_1385453784842_0013 container_1385522685434_0073_01_05.pid container_1385522685434_0073_01_17.pid application_1385445543402_0005 application_1385520079773_0005 container_1385522685434_0073_01_08.pid
[root@CHBM223 conf]# ls /data/mrlocal/1/yarn/local/usercache/ hdfs helen hive root

On Thu, Dec 5, 2013 at 5:12 AM, Jian He j...@hortonworks.com wrote: The following links may help you: http://hortonworks.com/blog/management-of-application-dependencies-in-yarn/ http://hortonworks.com/blog/resource-localization-in-yarn-deep-dive/ Thanks, Jian

On Tue, Dec 3, 2013 at 5:26 PM, ch huang justlo...@gmail.com wrote: Hi mailing list: I see three dirs in my local MR job dir, and I do not know what these dirs are used for. Does anyone know? # ls /data/1/mrlocal/yarn/local/ filecache/ nmPrivate/ usercache/
Re: Implementing and running an applicationmaster
Hi, I took a look at the code and found some examples on the web. One example is: http://wiki.opf-labs.org/display/SP/Resource+management It seems that users can run simple shell commands using a Client on YARN. But when it comes to a practical MapReduce example like WordCount, people still run commands the old way, as in MRv1. How can I run WordCount using a Client and ApplicationMaster on YARN so that I can request resources flexibly? Thanks!

On Mon, Dec 2, 2013 at 11:26 AM, Rob Blah tmp5...@gmail.com wrote: Hi, Follow the example provided in Yarn_dist/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-applications-distributedshell. regards tmp

2013/12/1 Yue Wang terra...@gmail.com: Hi, I found the page (http://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html) and know how to write an ApplicationMaster. However, is there a complete example showing how to run this ApplicationMaster with a real Hadoop program (e.g. WordCount) on YARN? Thanks! Yue
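One point that may resolve part of the confusion: when mapreduce.framework.name is set to yarn, an ordinary MapReduce job such as WordCount already runs under a YARN ApplicationMaster (MRAppMaster); the client submits the job and YARN launches the AM for it, so a custom Client/ApplicationMaster pair (as in the distributed-shell example) is only needed for non-MapReduce applications. A minimal sketch of submitting WordCount programmatically, assuming the hadoop-mapreduce-examples jar (which provides WordCount.TokenizerMapper and WordCount.IntSumReducer) is on the classpath:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.examples.WordCount;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCountOnYarn {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Run through the YARN MRAppMaster rather than the local or classic runner.
    conf.set("mapreduce.framework.name", "yarn");
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCountOnYarn.class);
    job.setMapperClass(WordCount.TokenizerMapper.class);
    job.setCombinerClass(WordCount.IntSumReducer.class);
    job.setReducerClass(WordCount.IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}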
get error in running terasort tool
Hi mailing list: I tried to run terasort on my cluster, but it failed with the following error, and I do not know why. Can anyone help?

# hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar terasort /alex/terasort/1G-input /alex/terasort/1G-output 13/12/05 15:15:43 INFO terasort.TeraSort: starting 13/12/05 15:15:43 INFO mapred.FileInputFormat: Total input paths to process : 1 13/12/05 15:15:45 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library 13/12/05 15:15:45 INFO compress.CodecPool: Got brand-new compressor [.deflate] Making 1 from 10 records Step size is 10.0 13/12/05 15:15:45 WARN hdfs.DFSClient: DataStreamer Exception org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /alex/terasort/1G-input/_partition.lst could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and no node(s) are excluded in this operation. at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1339) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2198) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1745) at org.apache.hadoop.ipc.Client.call(Client.java:1237) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202) at com.sun.proxy.$Proxy9.addBlock(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:291) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83) at com.sun.proxy.$Proxy10.addBlock(Unknown Source) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1177) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1030) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:488) org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /alex/terasort/1G-input/_partition.lst could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1339) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2198) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1745) at org.apache.hadoop.ipc.Client.call(Client.java:1237)
Re: get error in running terasort tool
BTW.i use CDH4.4 On Thu, Dec 5, 2013 at 3:18 PM, ch huang justlo...@gmail.com wrote: hi,maillist: i try run terasort in my cluster ,but failed ,following is error ,i do not know why, anyone can help? # hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar terasort /alex/terasort/1G-input /alex/terasort/1G-output 13/12/05 15:15:43 INFO terasort.TeraSort: starting 13/12/05 15:15:43 INFO mapred.FileInputFormat: Total input paths to process : 1 13/12/05 15:15:45 INFO zlib.ZlibFactory: Successfully loaded initialized native-zlib library 13/12/05 15:15:45 INFO compress.CodecPool: Got brand-new compressor [.deflate] Making 1 from 10 records Step size is 10.0 13/12/05 15:15:45 WARN hdfs.DFSClient: DataStreamer Exception org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /alex/terasort/1G-input/_partition.lst could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and no node(s) are excluded in this operation. at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1339) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2198) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1745) at org.apache.hadoop.ipc.Client.call(Client.java:1237) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202) at com.sun.proxy.$Proxy9.addBlock(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:291) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83) at com.sun.proxy.$Proxy10.addBlock(Unknown Source) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1177) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1030) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:488) org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /alex/terasort/1G-input/_partition.lst could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and no node(s) are excluded in this operation. 
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1339) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2198) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at
Re: get error in running terasort tool
can you check how many healthy data nodes available in your cluster? Use: #hadoop dfsadmin -report Regards Jitendra On Thu, Dec 5, 2013 at 12:48 PM, ch huang justlo...@gmail.com wrote: hi,maillist: i try run terasort in my cluster ,but failed ,following is error ,i do not know why, anyone can help? # hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar terasort /alex/terasort/1G-input /alex/terasort/1G-output 13/12/05 15:15:43 INFO terasort.TeraSort: starting 13/12/05 15:15:43 INFO mapred.FileInputFormat: Total input paths to process : 1 13/12/05 15:15:45 INFO zlib.ZlibFactory: Successfully loaded initialized native-zlib library 13/12/05 15:15:45 INFO compress.CodecPool: Got brand-new compressor [.deflate] Making 1 from 10 records Step size is 10.0 13/12/05 15:15:45 WARN hdfs.DFSClient: DataStreamer Exception org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /alex/terasort/1G-input/_partition.lst could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and no node(s) are excluded in this operation. at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1339) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2198) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1745) at org.apache.hadoop.ipc.Client.call(Client.java:1237) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202) at com.sun.proxy.$Proxy9.addBlock(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:291) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83) at com.sun.proxy.$Proxy10.addBlock(Unknown Source) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1177) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1030) at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:488) org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /alex/terasort/1G-input/_partition.lst could only be replicated to 0 nodes instead of minReplication (=1). 
There are 4 datanode(s) running and no node(s) are excluded in this operation. at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1339) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2198) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:501) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:299) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44954) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1751) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1747) at java.security.AccessController.doPrivileged(Native Method) at
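In addition to the dfsadmin report suggested above, a few quick checks that usually narrow down "could only be replicated to 0 nodes" even when datanodes are listed as running; the data-directory path below is only an example, substitute your own dfs.datanode.data.dir:

Live datanodes and the DFS space each one reports as remaining:
# hdfs dfsadmin -report

Overall namespace health:
# hdfs fsck /

On each datanode, check that the configured data directories are writable and not full:
# df -h /data/dfs/dn

If every datanode reports near-zero remaining capacity, or the datanode volumes are full or read-only, the NameNode will refuse to place new replicas even though the nodes still show up in the report.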
Re: Ant BuildException error building Hadoop 2.2.0
Hi again, I've tried to build using JDK 1.6.0_38 and I'm still getting the same exception:

~/hadoop-2.2.0-maven$ java -version
java version 1.6.0_38-ea
Java(TM) SE Runtime Environment (build 1.6.0_38-ea-b04)
Java HotSpot(TM) 64-Bit Server VM (build 20.13-b02, mixed mode)

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (make) on project hadoop-common: An Ant BuildException has occured: exec returned: 1
[ERROR] around Ant part ...exec dir=/home/scaino/hadoop-2.2.0-maven/hadoop-common-project/hadoop-common/target/native executable=cmake failonerror=true... @ 4:135 in /home/scaino/hadoop-2.2.0-maven/hadoop-common-project/hadoop-common/target/antrun/build-main.xml

Could it be a missing dependency? Do you know how I can check that the plugin actually exists using Maven? Thanks!

On 4 December 2013 20:23, java8964 java8...@hotmail.com wrote: Can you try JDK 1.6? I just did a Hadoop 2.2.0 GA release build myself a few days ago. In my experience, JDK 1.7 did not work for me. Yong

-- Date: Wed, 4 Dec 2013 19:55:16 +0100 Subject: Re: Ant BuildException error building Hadoop 2.2.0 From: silvi.ca...@gmail.com To: user@hadoop.apache.org

Hi, It seems I do: ~/hadoop-2.2.0-maven$ cmake --version cmake version 2.8.2

On 4 December 2013 19:51, java8964 java8...@hotmail.com wrote: Do you have 'cmake' in your environment? Yong

-- Date: Wed, 4 Dec 2013 17:20:03 +0100 Subject: Ant BuildException error building Hadoop 2.2.0 From: silvi.ca...@gmail.com To: user@hadoop.apache.org

Hello everyone, I've been having trouble building Hadoop 2.2.0 using Maven 3.1.1; this is part of the output I get (full log at http://pastebin.com/FE6vu46M):

[INFO] Reactor Summary:
[INFO] Apache Hadoop Main SUCCESS [27.471s]
[INFO] Apache Hadoop Project POM . SUCCESS [0.936s]
[INFO] Apache Hadoop Annotations . SUCCESS [3.819s]
[INFO] Apache Hadoop Assemblies .. SUCCESS [0.414s]
[INFO] Apache Hadoop Project Dist POM SUCCESS [1.834s]
[INFO] Apache Hadoop Maven Plugins ... SUCCESS [4.693s]
[INFO] Apache Hadoop MiniKDC . SUCCESS [4.346s]
[INFO] Apache Hadoop Auth SUCCESS [4.923s]
[INFO] Apache Hadoop Auth Examples ... SUCCESS [2.797s]
[INFO] Apache Hadoop Common .. FAILURE [22.898s]
[INFO] Apache Hadoop NFS . SKIPPED ..
[INFO] BUILD FAILURE
[INFO] Total time: 1:17.655s
[INFO] Finished at: Wed Dec 04 16:18:31 CET 2013
[INFO] Final Memory: 64M/420M
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (make) on project hadoop-common: An Ant BuildException has occured: exec returned: 1
[ERROR] around Ant part ...exec dir=/home/scaino/hadoop-2.2.0-maven/hadoop-common-project/hadoop-common/target/native executable=cmake failonerror=true... @ 4:135 in /home/scaino/hadoop-2.2.0-maven/hadoop-common-project/hadoop-common/target/antrun/build-main.xml

I've checked Protoc and it seems to be working, same with the library path, which is pointing to the libraries (installed in $HOME/install/lib):

~/hadoop-2.2.0-maven$ protoc --version
libprotoc 2.5.0
~/hadoop-2.2.0-maven$ echo $LD_LIBRARY_PATH
/home/scaino/install/lib:/home/software/gcc-4.8/lib64:/home/software/mpich2-1.2.1/lib

This is some system information retrieved by Maven:

~/hadoop-2.2.0-maven$ mvn --version
Apache Maven 3.1.1 (0728685237757ffbf44136acec0402957f723d9a; 2013-09-17 17:22:22+0200)
Maven home: /home/scaino/apache-maven-3.1.1
Java version: 1.7.0_25, vendor: Oracle Corporation
Java home: /home/software/jdk1.7.0_25/jre
Default locale: es_ES, platform encoding: UTF-8
OS name: linux, version: 2.6.35-32-server, arch: amd64, family: unix

I would appreciate any guidelines or hints that could help me understand what is going on, since nothing I've tried so far has worked. Thanks a lot in advance. Regards, Silvina Caíno
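When the antrun/cmake step fails like this, the underlying compiler or cmake error is usually printed a little earlier in the Maven output than the BuildException itself, so scrolling up in the log (or re-running with mvn -X) normally shows the real cause. A hedged checklist that resolves many of these failures; the package names are Debian/Ubuntu examples only, adjust for your distribution:

Install the usual native-build prerequisites:
$ sudo apt-get install build-essential cmake zlib1g-dev libssl-dev

Rebuild without the native profile if you do not need the native libraries:
$ mvn clean package -DskipTests -Dtar

Or rebuild the native bits with debug output to surface the real cmake error:
$ mvn clean package -Pdist,native -DskipTests -Dtar -X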