RE: HDFS: Couldn't obtain the locations of the last block
Hi Zesheng,

I learned from your offline email that your Hadoop version is 2.0.0-alpha, and you also said "The block is allocated successfully in NN, but isn't created in DN". Yes, we may have this issue in 2.0.0-alpha; I suspect your issue is similar to HDFS-4516. Can you try Hadoop 2.4 or later? You should not be able to reproduce it on those versions.

From your description, the second block is created successfully and the NN would flush the edit log info to the shared journal; the shared storage might persist the info, but before it reports back over RPC there might be a timeout to the NN. So the block exists in the shared edit log, but no DN ever creates it. On restart, the client can fail, because in that Hadoop version the client would retry only if the NN reported the last block size as non-zero, i.e. if it had been synced (see more in HDFS-4516).

Regards,
Yi Liu

From: Zesheng Wu [mailto:wuzeshen...@gmail.com]
Sent: Tuesday, September 09, 2014 6:16 PM
To: user@hadoop.apache.org
Subject: HDFS: Couldn't obtain the locations of the last block

Hi,

These days we encountered a critical bug in HDFS which can prevent HBase from starting normally. The scenario is as follows:
1. rs1 writes data to HDFS file f1, and the first block is written successfully
2. rs1 applies to create the second block successfully; at this point nn1 (active NN) crashes due to a journal write timeout
3. nn2 (standby NN) doesn't become active because zkfc2 is in an abnormal state
4. nn1 is restarted and becomes active
5. During the restart of nn1, rs1 crashes because it writes to nn1 while nn1 is in safe mode
6. As a result, file f1 is left in an abnormal state and the HBase cluster can't serve any more

We can list the file with the command-line shell; it looks like the following:
-rw--- 3 hbase_srv supergroup 134217728 2014-09-05 11:32 /hbase/lgsrv-push/xxx

But when we try to download the file from HDFS, the DFS client complains:
14/09/09 18:12:11 WARN hdfs.DFSClient: Last block locations not available. Datanodes might not have reported blocks completely. Will retry for 3 times
14/09/09 18:12:15 WARN hdfs.DFSClient: Last block locations not available. Datanodes might not have reported blocks completely. Will retry for 2 times
14/09/09 18:12:19 WARN hdfs.DFSClient: Last block locations not available. Datanodes might not have reported blocks completely. Will retry for 1 times
get: Could not obtain the last block locations.

Can anyone help with this?

--
Best Wishes!
Yours, Zesheng
S3 with Hadoop 2.5.0 - Not working
Hi,

I have downloaded hadoop-2.5.0 and am trying to get it working with an S3 backend (single-node in pseudo-distributed mode). I have made changes to core-site.xml according to https://wiki.apache.org/hadoop/AmazonS3 and I have a backend object store running on my machine that supports S3.

I get the following message when I try to start the daemons:

Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.

root@ubuntu:/build/hadoop/hadoop-2.5.0# ./sbin/start-dfs.sh
Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
Starting namenodes on []
localhost: starting namenode, logging to /build/hadoop/hadoop-2.5.0/logs/hadoop-root-namenode-ubuntu.out
localhost: starting datanode, logging to /build/hadoop/hadoop-2.5.0/logs/hadoop-root-datanode-ubuntu.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /build/hadoop/hadoop-2.5.0/logs/hadoop-root-secondarynamenode-ubuntu.out
root@ubuntu:/build/hadoop/hadoop-2.5.0#

The daemons don't start after the above. I get the same error if I add the property fs.defaultFS and set its value to the s3 bucket, but if I change the defaultFS to hdfs:// it works fine - I am able to launch the daemons.

My core-site.xml:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>s3://bucket1</value>
  </property>
  <property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>abcd</value>
  </property>
  <property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>1234</value>
  </property>
</configuration>

I am able to list the buckets and their contents via s3cmd and boto, but unable to get an S3 configuration started via Hadoop. Also, in the core-default.xml listed on the website I don't see an implementation for s3:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml
There is an fs.s3.impl entry up to the 1.2.1 release. So does the 2.5.0 release support S3, or do I need to do anything else?

cheers,
Dhiraj
Re: HDFS: Couldn't obtain the locations of the last block
Thanks Yi, I will look into HDFS-4516.

2014-09-10 15:03 GMT+08:00 Liu, Yi A yi.a@intel.com:
[...]

--
Best Wishes!
Yours, Zesheng
Start standby namenode using bootstrapStandby hangs
Hi Experts,

My Hadoop cluster has HA enabled with QJM, and I failed to upgrade it from version 2.2.0 to 2.4.1. Why? Is this an existing issue?

My steps:
1. Stop the Hadoop cluster
2. On each node, upgrade the Hadoop binaries to the newer version
3. On each JournalNode: sbin/hadoop-daemon.sh start journalnode
4. On each DataNode: sbin/hadoop-daemon.sh start datanode
5. On the previously active NameNode: sbin/hadoop-daemon.sh start namenode -upgrade
6. On the previously standby NameNode: sbin/hadoop-daemon.sh start namenode -bootstrapStandby

Encountered issue: the NameNode service failed to start normally, with a warning as below:

2014-09-10 15:57:41,730 WARN org.apache.hadoop.hdfs.server.common.Util: Path /hadoop/hdfs/name should be specified as a URI in configuration files. Please update hdfs configuration.

After printing the above warning, the command hangs and does not produce any further warning or error messages.

Thanks!
Re: S3 with Hadoop 2.5.0 - Not working
> Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
> Starting namenodes on []

The NameNode/DataNode are part of an HDFS service. It makes no sense to try to run them over an S3 URL as the default filesystem, since S3 is a complete filesystem in itself. The services need fs.defaultFS to be set to an HDFS URI to be able to start up.

> but unable to get an s3 config started via hadoop

You can run jobs over S3 input and output data by running a regular MR cluster on HDFS - just pass the right URIs as the input and output parameters of the job. To do this, set your S3 properties in core-site.xml but let fs.defaultFS stay of HDFS type.

> There is an s3.impl until 1.2.1 release. So does the 2.5.0 release support s3 or do i need to do anything else.

In Apache Hadoop 2 we dynamically load the FS classes, so we do not need the fs.NAME.impl configs anymore as we did in Apache Hadoop 1.

On Wed, Sep 10, 2014 at 1:15 PM, Dhiraj jar...@gmail.com wrote:
[...]

--
Harsh J
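To illustrate the point above (this is not code from the thread): with fs.defaultFS left as an HDFS URI, an S3 bucket can still be addressed explicitly by URI from the same configuration. A minimal sketch, reusing the fs.s3.* credential keys and bucket name from the original mail, with a placeholder NameNode address:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListS3Bucket {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // The HDFS daemons keep their own default filesystem (placeholder address) ...
        conf.set("fs.defaultFS", "hdfs://localhost:9000");
        // ... while S3 is addressed explicitly by URI, using the same credential
        // keys as in the core-site.xml from the original mail.
        conf.set("fs.s3.awsAccessKeyId", "abcd");
        conf.set("fs.s3.awsSecretAccessKey", "1234");

        FileSystem s3 = FileSystem.get(URI.create("s3://bucket1/"), conf);
        for (FileStatus status : s3.listStatus(new Path("s3://bucket1/"))) {
            System.out.println(status.getPath());
        }
    }
}

The same idea applies to MR jobs: keep HDFS as the default filesystem and pass s3:// URIs as the job's input or output paths.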
RE: Error and problem when running a hadoop job
Thank you all for your support. I could fix the issue this morning using this link; it is clearly explained:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/#java-io-ioexception-incompatible-namespaceids
You can use the link as well.

Warm regards

From: vivek [mailto:vivvekbha...@gmail.com]
Sent: Tuesday 9 September 2014 19:31
To: user@hadoop.apache.org
Subject: Re: Error and problem when running a hadoop job

Is there any namespace mismatch? Try to delete the data in the datanode directory.

On Tue, Sep 9, 2014 at 10:41 PM, Sandeep Khurana skhurana...@gmail.com wrote:
Check the log file at ./hadoop/hadoop-datanide-latdevweb02.out (as per your last screenshot). There can be various reasons for the datanode not starting; the real issue will be logged in that file.

On Tue, Sep 9, 2014 at 10:06 PM, YIMEN YIMGA Gael gael.yimen-yi...@sgcib.com wrote:
Hi,
When I run the following command to launch the DATANODE as shown in the screenshot below, all is OK. But when I run the JPS command, I do not see the datanode process.
[screenshot]
That's where my worry is ☹ ☹
Standing by ....

From: vivek [mailto:vivvekbha...@gmail.com]
Sent: Tuesday 9 September 2014 17:27
To: user@hadoop.apache.org
Subject: Re: Error and problem when running a hadoop job

Check whether the datanode is started.

On Tue, Sep 9, 2014 at 7:26 PM, YIMEN YIMGA Gael gael.yimen-yi...@sgcib.com wrote:
Yes, all the ssh access setup has been done. My cluster is a single-node cluster.
Standing by ...

From: Sandeep Khurana [mailto:skhurana...@gmail.com]
Sent: Tuesday 9 September 2014 15:54
To: user@hadoop.apache.org
Subject: Re: Error and problem when running a hadoop job

I hope you did set up passphrase-less ssh access to localhost by generating keys etc.?

On Sep 9, 2014 7:18 PM, YIMEN YIMGA Gael gael.yimen-yi...@sgcib.com wrote:
Hello dear Hadoopers,
I hope you are doing well. I tried to run the WordCount.jar file to experience running Hadoop jobs. After launching the program as shown in the screenshot below, I got the message in the screenshot. The job tries to connect to the datanode but fails after 10 attempts, and I got the error in the second screenshot.
After that, I first stopped all the Hadoop daemons, second formatted the dfs, third re-launched the Hadoop daemons, and I noticed using the JPS command that the DATANODE was not running. I then ran the datanode alone with the command bin/hadoop-daemon.sh start datanode as shown in the third screenshot, but the datanode is still not up and running.
Could someone advise in this case, please?
Standing by for your habitual support. Thanks in advance.
GYY
[screenshots]

--
Thanks and Regards,
VIVEK KOUL

--
Thanks and regards
Sandeep Khurana

--
Thanks and Regards,
VIVEK KOUL
MapReduce data decompression using a custom codec
Hello,

I developed a custom compression codec for Hadoop. Of course, Hadoop is set to use my codec when compressing data. For testing purposes, I use the following two commands:

Compression test command:
---
hadoop jar /opt/cloudera/parcels/CDH-5.1.2-1.cdh5.1.2.p0.3/lib/hadoop//../hadoop-mapreduce/hadoop-streaming.jar -Dmapreduce.output.fileoutputformat.compress=true -input /originalFiles/ -output /compressedFiles/ -mapper cat -reducer cat

Decompression test command:
---
hadoop jar /opt/cloudera/parcels/CDH-5.1.2-1.cdh5.1.2.p0.3/lib/hadoop//../hadoop-mapreduce/hadoop-streaming.jar -Dmapreduce.output.fileoutputformat.compress=false -input /compressedFiles/ -output /decompressedFiles/ -mapper cat -reducer cat

As you can see, both are quite similar: only the compression option and the input/output directories change. The first command compresses the input data, then 'cat's it (the Linux command, you know) to the output file. The second one decompresses the input data (which is supposed to be compressed), then 'cat's it to the output file. As I understand it, Hadoop is supposed to auto-detect compressed input data and decompress it using the right codec.

Both the compression and decompression tests work well when Hadoop is set to use a default codec, like BZip2 or Snappy. However, when using my custom compression codec, only the compression works: the decompression is sluggish and triggers errors (Java heap space):

packageJobJar: [] [/opt/cloudera/parcels/CDH-5.1.2-1.cdh5.1.2.p0.3/lib/hadoop-mapreduce/hadoop-streaming-2.3.0-cdh5.1.2.jar] /tmp/streamjob6475393520304432687.jar tmpDir=null
14/09/09 15:33:21 INFO client.RMProxy: Connecting to ResourceManager at bluga2/10.1.96.222:8032
14/09/09 15:33:22 INFO client.RMProxy: Connecting to ResourceManager at bluga2/10.1.96.222:8032
14/09/09 15:33:23 INFO mapred.FileInputFormat: Total input paths to process : 1
14/09/09 15:33:23 INFO mapreduce.JobSubmitter: number of splits:1
14/09/09 15:33:23 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1410264242020_0016
14/09/09 15:33:24 INFO impl.YarnClientImpl: Submitted application application_1410264242020_0016
14/09/09 15:33:24 INFO mapreduce.Job: The url to track the job: http://bluga2:8088/proxy/application_1410264242020_0016/
14/09/09 15:33:24 INFO mapreduce.Job: Running job: job_1410264242020_0016
14/09/09 15:33:30 INFO mapreduce.Job: Job job_1410264242020_0016 running in uber mode : false
14/09/09 15:33:30 INFO mapreduce.Job: map 0% reduce 0%
14/09/09 15:35:12 INFO mapreduce.Job: map 100% reduce 0%
14/09/09 15:35:13 INFO mapreduce.Job: Task Id : attempt_1410264242020_0016_m_00_0, Status : FAILED
Error: Java heap space
14/09/09 15:35:14 INFO mapreduce.Job: map 0% reduce 0%
14/09/09 15:35:41 INFO mapreduce.Job: Task Id : attempt_1410264242020_0016_m_00_1, Status : FAILED
Error: Java heap space
14/09/09 15:36:02 INFO mapreduce.Job: Task Id : attempt_1410264242020_0016_m_00_2, Status : FAILED
Error: Java heap space
14/09/09 15:36:49 INFO mapreduce.Job: map 100% reduce 0%
14/09/09 15:36:50 INFO mapreduce.Job: map 100% reduce 100%
14/09/09 15:36:56 INFO mapreduce.Job: Job job_1410264242020_0016 failed with state FAILED due to: Task failed task_1410264242020_0016_m_00
Job failed as tasks failed. failedMaps:1 failedReduces:0
14/09/09 15:36:58 INFO mapreduce.Job: Counters: 9
Job Counters
Failed map tasks=4
Launched map tasks=4
Other local map tasks=3
Data-local map tasks=1
Total time spent by all maps in occupied slots (ms)=190606
Total time spent by all reduces in occupied slots (ms)=0
Total time spent by all map tasks (ms)=190606
Total vcore-seconds taken by all map tasks=190606
Total megabyte-seconds taken by all map tasks=195180544
14/09/09 15:36:58 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!

I already tried to increase the maximum map heap size (the mapreduce.map.java.opts.max.heap YARN property) from 1 GiB to 2 GiB, but the decompression still doesn't work. By the way, I'm compressing and decompressing a small ~2 MB file and use the latest Cloudera version.

I built a quick Java test environment to try to reproduce the Hadoop codec calls (instantiating the codec, creating a new compression stream from it, ...). I noticed that the decompression is an infinite loop in which only the first block of compressed data is decompressed, over and over. This could explain the Java heap space error above.

What am I doing wrong / what did I forget? How could my codec decompress data without trouble?

Thank you for helping!
Kévin Poupon
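As a point of reference for the kind of standalone test described above (this is not the poster's actual code), a round trip through a codec outside MapReduce might look like the sketch below; com.example.MyCodec is a placeholder for the custom codec class. If the custom Decompressor never reports finished() once its input is exhausted, the read loop here never terminates, which matches the "first block decompressed over and over" behaviour and would eventually exhaust the heap.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionInputStream;
import org.apache.hadoop.io.compress.CompressionOutputStream;
import org.apache.hadoop.util.ReflectionUtils;

public class CodecRoundTrip {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder class name: instantiate the custom codec the same way Hadoop does.
        CompressionCodec codec = (CompressionCodec) ReflectionUtils.newInstance(
                conf.getClassByName("com.example.MyCodec"), conf);

        byte[] original = new byte[2 * 1024 * 1024];
        new java.util.Random(42).nextBytes(original);

        // Compress into memory through the codec's output stream.
        ByteArrayOutputStream compressedBytes = new ByteArrayOutputStream();
        CompressionOutputStream out = codec.createOutputStream(compressedBytes);
        out.write(original);
        out.finish();
        out.close();

        // Decompress again. If the Decompressor never signals that it is finished,
        // this loop spins forever on the first block of data.
        CompressionInputStream in = codec.createInputStream(
                new ByteArrayInputStream(compressedBytes.toByteArray()));
        ByteArrayOutputStream decompressedBytes = new ByteArrayOutputStream();
        byte[] buffer = new byte[64 * 1024];
        int n;
        while ((n = in.read(buffer)) != -1) {
            decompressedBytes.write(buffer, 0, n);
        }
        in.close();

        System.out.println("round trip ok: "
                + java.util.Arrays.equals(original, decompressedBytes.toByteArray()));
    }
}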
Re: Regular expressions in fs paths?
I want to unsubscribe from this mailing list.

On Wed, Sep 10, 2014 at 4:42 PM, Charles Robertson charles.robert...@gmail.com wrote:
Hi all,
Is it possible to use regular expressions in fs commands? Specifically, I want to use the copy (-cp) and move (-mv) commands on all files in a directory that match a pattern (the pattern being all files that do not end in '.tmp'). Can this be done?
Thanks,
Charles
Re: Regular expressions in fs paths?
Yes you can: hadoop fs -ls /tmp/myfiles*
I would recommend first using -ls in order to verify you are selecting the right files.

#Mahesh: do you need some help doing this?

On 10.09.2014 13:46, Mahesh Khandewal wrote:
[...]
Re: HDFS: Couldn't obtain the locations of the last block
Hi Yi, I went through HDFS-4516, and it really solves our problem. Thanks very much!

2014-09-10 16:39 GMT+08:00 Zesheng Wu wuzeshen...@gmail.com:
[...]

--
Best Wishes!
Yours, Zesheng
Re: Regular expressions in fs paths?
Hi Georgi,

Thanks for your reply. Won't hadoop fs -ls /tmp/myfiles* return all files that begin with 'myfiles' in the tmp directory? What I don't understand is how I can specify a pattern that excludes files ending in '.tmp'. I have tried using the normal regular expression syntax for this ^(.tmp) but it tries to match it literally.

Regards,
Charles

On 10 September 2014 13:07, Georgi Ivanov iva...@vesseltracker.com wrote:
[...]
RE: HDFS: Couldn't obtain the locations of the last block
That’s great.

Regards,
Yi Liu

From: Zesheng Wu [mailto:wuzeshen...@gmail.com]
Sent: Wednesday, September 10, 2014 8:25 PM
To: user@hadoop.apache.org
Subject: Re: HDFS: Couldn't obtain the locations of the last block
[...]
Error when executing a WordCount Program
Hello Hadoopers,

Here is the error I'm facing when running the WordCount example program written by myself. Kindly find attached the files of my WordCount program. Below is the error.

===
-bash-4.1$ bin/hadoop jar WordCount.jar
Entrée dans le programme MAIN !!!
14/09/10 15:00:24 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/09/10 15:00:24 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
14/09/10 15:00:24 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/09/10 15:00:24 WARN snappy.LoadSnappy: Snappy native library not loaded
14/09/10 15:00:24 INFO mapred.JobClient: Cleaning up the staging area hdfs://latdevweb02:9000/user/hadoop/.staging/job_201409101141_0001
14/09/10 15:00:24 ERROR security.UserGroupInformation: PriviledgedActionException as:hadoop cause:org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://latdevweb02:9000/home/hadoop/hadoop/input
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://latdevweb02:9000/home/hadoop/hadoop/input
        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:197)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:208)
        at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:1081)
        at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1073)
        at org.apache.hadoop.mapred.JobClient.access$700(JobClient.java:179)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:983)
        at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
        at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:910)
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1353)
        at fr.societegenerale.bigdata.lactool.WordCountDriver.main(WordCountDriver.java:50)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:601)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
-bash-4.1$
===

Thanks in advance for your help.

Warm regards
GYY

Attachments: WordCountReducer.java, WordCountMapper.java, WordCountDriver.java
Re: Error when executing a WordCount Program
hdfs://latdevweb02:9000/home/hadoop/hadoop/input - is this a valid path on HDFS? Can you access this path outside of the program, for example using the hadoop fs -ls command?

Also, were this path and the files in it created by a different user? The exception seems to say that it does not exist or that the running user does not have permission to read it.

Regards,
Shahab

On Wed, Sep 10, 2014 at 9:09 AM, YIMEN YIMGA Gael gael.yimen-yi...@sgcib.com wrote:
[...]
Re: Error when executing a WordCount Program
Hi, have you set a class in your code?

WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).

Also, you need to check the path of your input file:

Input path does not exist: hdfs://latdevweb02:9000/home/hadoop/hadoop/input

These are pretty straightforward errors; resolve them and you should be good to go.

Sent from my iPhone

On 10 Sep 2014, at 14:19, Shahab Yunus shahab.yu...@gmail.com wrote:
[...]
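For reference (this is not the attached code): with the old mapred API that appears in the stack trace, a driver usually addresses both warnings by passing the driver class to JobConf, so the containing job jar is located and shipped, and by setting explicit input/output paths that exist on HDFS. A minimal sketch with placeholder class and path names:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class WordCountDriverSketch {
    public static void main(String[] args) throws Exception {
        // Passing the driver class lets Hadoop find and ship the containing jar,
        // which is what the "No job jar file set" warning is about.
        JobConf job = new JobConf(WordCountDriverSketch.class);
        job.setJobName("wordcount");

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // job.setMapperClass(...) / job.setReducerClass(...) would point at the
        // attached WordCountMapper / WordCountReducer classes.

        // These must be existing HDFS paths, e.g. created beforehand with
        //   hadoop fs -mkdir /user/hadoop/input
        //   hadoop fs -put some-local-file.txt /user/hadoop/input
        FileInputFormat.setInputPaths(job, new Path("/user/hadoop/input"));
        FileOutputFormat.setOutputPath(job, new Path("/user/hadoop/output"));

        JobClient.runJob(job);
    }
}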
running beyond virtual memory limits
Hello,

I am getting the following error when running on a 500 MB dataset compressed in the Avro data format.

Container [pid=22961,containerID=container_1409834588043_0080_01_10] is running beyond virtual memory limits. Current usage: 636.6 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1409834588043_0080_01_10 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 22961 16896 22961 22961 (bash) 0 0 9424896 312 /bin/bash -c /usr/java/default/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx768m -Djava.io.tmpdir=/home/hadoop/yarn/local/usercache/jobsubmit/appcache/application_1409834588043_0080/container_1409834588043_0080_01_10/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/hadoop/yarn/logs/application_1409834588043_0080/container_1409834588043_0080_01_10 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 153.87.47.116 47184 attempt_1409834588043_0080_r_00_0 10 1/home/hadoop/yarn/logs/application_1409834588043_0080/container_1409834588043_0080_01_10/stdout 2/home/hadoop/yarn/logs/application_1409834588043_0080/container_1409834588043_0080_01_10/stderr
|- 22970 22961 22961 22961 (java) 24692 1165 2256662528 162659 /usr/java/default/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Xmx768m -Djava.io.tmpdir=/home/hadoop/yarn/local/usercache/jobsubmit/appcache/application_1409834588043_0080/container_1409834588043_0080_01_10/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/home/hadoop/yarn/logs/application_1409834588043_0080/container_1409834588043_0080_01_10 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 153.87.47.116 47184 attempt_1409834588043_0080_r_00_0 10
Container killed on request. Exit code is 143

I have read a lot about Hadoop YARN memory settings, but it seems something basic is missing in my understanding of how YARN and MR2 work. I have a pretty small testing cluster of 5 machines, 2 NN and 3 DN, with the following parameters set:

# hadoop - yarn-site.xml
yarn.nodemanager.resource.memory-mb : 2048
yarn.scheduler.minimum-allocation-mb : 256
yarn.scheduler.maximum-allocation-mb : 2048

# hadoop - mapred-site.xml
mapreduce.map.memory.mb : 768
mapreduce.map.java.opts : -Xmx512m
mapreduce.reduce.memory.mb : 1024
mapreduce.reduce.java.opts : -Xmx768m
mapreduce.task.io.sort.mb : 100
yarn.app.mapreduce.am.resource.mb : 1024
yarn.app.mapreduce.am.command-opts : -Xmx768m

I understand the mathematics here for the parameters, but what I do not understand is: do your containers need to grow with the size of your dataset, e.g. by setting mapreduce.map.memory.mb and mapreduce.map.java.opts on a per-job basis?

My reducer doesn't cache any data; it is simply in - out, and just categorizes the data to multiple outputs as follows, using AvroMultipleOutputs():

@Override
public void reduce(Text key, Iterable<AvroValue<PosData>> values, Context context) throws IOException, InterruptedException {
    try {
        log.info("Processing key {}", key.toString());
        final StoreIdDob storeIdDob = separateKey(key);
        log.info("Processing DOB {}, SotoreId {}", storeIdDob.getDob(), storeIdDob.getStoreId());
        int size = 0;
        Output out;
        String path;
        if (storeIdDob.getDob() != null && isValidDOB(storeIdDob.getDob())
                && storeIdDob.getStoreId() != null && !storeIdDob.getStoreId().isEmpty()) {
            // reasonable data
            if (isHistoricalDOB(storeIdDob.getDob())) {
                out = Output.HISTORY;
            } else {
                out = Output.ACTUAL;
            }
            path = out.getKey() + "/" + storeIdDob.getDob() + "/" + storeIdDob.getStoreId();
        } else {
            // error data
            out = Output.ERROR;
            path = out.getKey() + "/" + part;
        }
        for (AvroValue<PosData> posData : values) {
            amos.write(out.getKey(), new AvroKey<PosData>(posData.datum()), null, path);
        }
    } catch (Exception e) {
        log.error("Error on reducer ", e);
        // TODO audit log :-)
    }
}

Do I need to grow the container size with the size of the dataset? That seems odd to me, and I expected that this is what MR is for. Or am I missing some setting that decides the size of the data chunks?

Thx
Jakub
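For reference, the container sizes above are ordinary job configuration, so they can also be overridden per job instead of cluster-wide in mapred-site.xml. A minimal sketch with hypothetical values, keeping each -Xmx below the corresponding *.memory.mb container limit:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class PerJobMemory {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical per-job overrides of the cluster defaults listed above.
        conf.setInt("mapreduce.map.memory.mb", 1024);
        conf.set("mapreduce.map.java.opts", "-Xmx800m");
        conf.setInt("mapreduce.reduce.memory.mb", 2048);
        conf.set("mapreduce.reduce.java.opts", "-Xmx1600m");

        Job job = Job.getInstance(conf, "per-job memory example");
        // ... mapper/reducer/input/output would be configured here before
        // job.waitForCompletion(true) submits the job.
    }
}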
RE: Error when executing a WordCount Program
Hi,

Please, that is my real problem. Could you please look into my attached code and tell me how I can update it? How do I set a job jar file?

And now, here is my hdfs-site.xml:
==
-bash-4.1$ cat conf/hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/tmp/hadoop-hadoop/dfs/data</value>
  </property>
</configuration>
-bash-4.1$
==

Could you advise on how to solve the error "input path does not exist"?

Standing by ...

Cheers

From: Chris MacKenzie [mailto:stu...@chrismackenziephotography.co.uk]
Sent: Wednesday 10 September 2014 15:27
To: user@hadoop.apache.org
Subject: Re: Error when executing a WordCount Program
[...]
RE: Error when executing a WordCount Program
Hi,

In fact, hdfs://latdevweb02:9000/home/hadoop/hadoop/input is not a folder on HDFS. I created a folder /tmp/hadoop-hadoop/dfs/data, where data will be saved in HDFS. And in my HADOOP_HOME folder there are two folders, "input" and "output", but I don't know how to configure them in the program.

Please could you look into my code and advise?

Standing by ...

Warm regards

From: Shahab Yunus [mailto:shahab.yu...@gmail.com]
Sent: Wednesday 10 September 2014 15:19
To: user@hadoop.apache.org
Subject: Re: Error when executing a WordCount Program
[...]
Hadoop Smoke Test: TERASORT
Hi,

I am trying the smoke test for Hadoop (2.4.1). Regarding "terasort", below is my test command. The Map part completed very quickly because it was split into many subtasks, but the Reduce part is taking a very long time and there is only one Reduce task running. Is there a way to speed up the reduce phase by splitting the large reduce job into many smaller ones and running them across the cluster, like the Map part?

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar terasort /tmp/teragenout /tmp/terasortout

Job ID                   Name      State    Maps Total  Maps Completed  Reduce Total  Reduce Completed
job_1409876705457_0002   TeraSort  RUNNING  22352       22352           1             0

Regards
Arthur
Re: Regular expressions in fs paths?
HDFS doesn't support the full range of glob matching you will find in Linux. If you want to exclude all files from a directory listing that meet a certain criterion, try doing your listing and using grep -v to exclude the matching records.
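Not from the reply above, but as an alternative sketch when the copy or move is driven from Java rather than the fs shell: the FileSystem API accepts a PathFilter, so '.tmp' files can be excluded directly. The directory names here are hypothetical.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

public class ListNonTmp {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // Keep every file in the directory except those ending in ".tmp".
        PathFilter notTmp = new PathFilter() {
            @Override
            public boolean accept(Path path) {
                return !path.getName().endsWith(".tmp");
            }
        };

        for (FileStatus status : fs.listStatus(new Path("/data/incoming"), notTmp)) {
            System.out.println(status.getPath());
            // A move would be, for example:
            // fs.rename(status.getPath(), new Path("/data/ready", status.getPath().getName()));
        }
    }
}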
Writing output from streaming task without dealing with key/value
Hello!

Imagine the following common task: I want to process a big text file line-by-line using the streaming interface - run the Unix grep command, for instance, or some other line-by-line processing, e.g. line.upper(). I copy the file to HDFS. Then I run a map task on this file which reads one line, modifies it in some way and then writes it to the output.

TextInputFormat suits the reading well: its key is the offset in bytes (meaningless in my case) and the value is the line itself, so I can iterate over lines like this (in Python):

for line in sys.stdin:
    print(line.upper())

The problem arises with TextOutputFormat: it tries to split the resulting line on mapreduce.output.textoutputformat.separator, which results in an extra separator in the output if that character is missing from the line (an extra TAB at the end if we stick to the defaults).

Is there any way to write the result of a streaming task without any internal processing, so it appears exactly as the script produces it? If it is impossible with Hadoop, which works with key/value pairs, maybe there are other frameworks on top of HDFS which allow this?

Thanks in advance!
Re: Hadoop Smoke Test: TERASORT
You can set the number of reducers used in any hadoop job from the command line by using -Dmapred.reduce.tasks=XX. e.g. hadoop jar hadoop-mapreduce-examples.jar terasort -Dmapred.reduce.tasks=10 /terasort-input /terasort-output
Re: Writing output from streaming task without dealing with key/value
If you don't want the key in the final output, you can set it like this in Java:

job.setOutputKeyClass(NullWritable.class);

It will then just print the value in the output file. I don't know how to do it in Python.

On 9/10/14, Dmitry Sivachenko trtrmi...@gmail.com wrote:
[...]
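In context, that suggestion for a plain Java (non-streaming) job would look roughly like the sketch below; note that the mapper or reducer must also actually emit NullWritable.get() as its key for the separator to disappear.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class ValueOnlyJobSetup {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "value-only output");
        // With NullWritable keys, TextOutputFormat writes only the value and
        // appends no separator, so each output line is exactly the value text.
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(Text.class);
        // ... mapper/reducer classes and input/output paths would be set here ...
    }
}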
Re: Writing output from streaming task without dealing with key/value
In Python, or any streaming program, just set the output value to the empty string and you will get something like key\t. -- *Kernighan's Law* Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
Re: Writing output from streaming task without dealing with key/value
On Sep 10, 2014, at 22:05, Rich Haase rdha...@gmail.com wrote: In Python, or any streaming program, just set the output value to the empty string and you will get something like key\t.
I see, but I want to use many existing programs (like UNIX grep), and I don't want to have an extra \t in the output. Is there any way to achieve this? Or maybe it is possible to write a custom XxxOutputFormat to work around that issue? (Something opposite to TextInputFormat: it passes the input line without any modification to the script's stdin; there should be a way to write stdout to a file as is.) Thanks!
Re: Writing output from streaming task without dealing with key/value
You can write a custom output format, or you can write your mapreduce job in Java and use a NullWritable as Susheel recommended. grep (and every other *nix text processing command I can think of) would not be bothered by a trailing tab character. It's also quite easy to strip away that tab character if you don't want it during the post-processing steps you perform with *nix commands. -- *Kernighan's Law* Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
Re: Writing output from streaming task without dealing with key/value
On Sep 10, 2014, at 22:19, Rich Haase rdha...@gmail.com wrote: You can write a custom output format
Any clues how this can be done?
, or you can write your mapreduce job in Java and use a NullWritable as Susheel recommended. grep (and every other *nix text processing command I can think of) would not be bothered by a trailing tab character. It's also quite easy to strip away that tab character if you don't want it during the post-processing steps you perform with *nix commands.
The problem is that when the line itself contains a TAB in the middle, there will not be an extra trailing TAB at the end. So it is not that simple: you never know whether a TAB came from the original line or is the extra TAB added by TextOutputFormat. Thanks!
Re: Writing output from streaming task without dealing with key/value
Examples (the top ones are related to streaming jobs): http://www.infoq.com/articles/HadoopOutputFormat http://research.neustar.biz/2011/08/30/custom-inputoutput-formats-in-hadoop-streaming/ http://stackoverflow.com/questions/12759651/how-to-override-inputformat-and-outputformat-in-hadoop-application Regards, Shahab
Re: Writing output from streaming task without dealing with key/value
On Sep 10, 2014, at 22:47, Shahab Yunus shahab.yu...@gmail.com wrote: Examples (the top ones are related to streaming jobs)
Thanks for the links. The problem is that in RecordWriter() I get two parameters: key and value. If one of them is empty, I have no way to tell whether I should output the delimiter (because it was present in the original line) or not. What is the proper way to work around that issue?
Re: Writing output from streaming task without dealing with key/value
Use ‘tr -s’ to strip out the tabs?
$ echo -e a\t\t\tb
a b
$ echo -e a\t\t\tb | tr -s \t
a b
Re: Writing output from streaming task without dealing with key/value
If you don’t want anything to get inserted, just set your output to key only or value only. TextOutputFormat$LineRecordWriter won’t insert anything unless both values are set:
public synchronized void write(K key, V value) throws IOException {
  boolean nullKey = key == null || key instanceof NullWritable;
  boolean nullValue = value == null || value instanceof NullWritable;
  if (nullKey && nullValue) {
    return;
  }
  if (!nullKey) {
    writeObject(key);
  }
  if (!(nullKey || nullValue)) {
    out.write(keyValueSeparator);
  }
  if (!nullValue) {
    writeObject(value);
  }
  out.write(newline);
}
On Sep 10, 2014, at 1:37 PM, Dmitry Sivachenko trtrmi...@gmail.com wrote: There can be tabs in the input, I want to keep the input lines without any modification. Actually it is a rather standard task: process lines one by one without inserting extra characters. There should be a standard solution for it, IMO.
Re: Writing output from streaming task without dealing with key/value
On Sep 11, 2014, at 0:47, Felix Chern idry...@gmail.com wrote: If you don’t want anything to get inserted, just set your output to key only or value only. TextOutputFormat$LineRecordWriter won’t insert anything unless both values are set:
If I output the value only, for instance, and my line contains a TAB, then everything before the TAB will be lost? And if I output the key only, and my line contains a TAB, then everything after the TAB will be lost?
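For the custom output format route discussed in this thread, here is a rough sketch of what it could look like. The class name is made up, and it uses the old org.apache.hadoop.mapred API, which is what streaming's -inputformat/-outputformat options work with as far as I know. It writes key and value back out as one raw line and only inserts a TAB when the framework actually split one off, so in the common case the script's output is reproduced byte for byte. The ambiguity raised above does remain in one corner case: a line that originally ended with a bare trailing TAB arrives with an empty value, so that TAB cannot be reconstructed.

import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordWriter;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.util.Progressable;

// Illustrative sketch: write "key<TAB>value" back out as one raw line.
public class RawTextOutputFormat extends FileOutputFormat<Text, Text> {

  public RecordWriter<Text, Text> getRecordWriter(FileSystem ignored, JobConf job,
      String name, Progressable progress) throws IOException {
    Path file = FileOutputFormat.getTaskOutputPath(job, name);
    final FSDataOutputStream out = file.getFileSystem(job).create(file, progress);

    return new RecordWriter<Text, Text>() {
      public void write(Text key, Text value) throws IOException {
        // Write the key bytes exactly as received from the streaming script.
        out.write(key.getBytes(), 0, key.getLength());
        // Re-insert the tab only when the framework actually split something
        // off into the value; a line with no tab arrives with an empty value.
        if (value != null && value.getLength() > 0) {
          out.write('\t');
          out.write(value.getBytes(), 0, value.getLength());
        }
        out.write('\n');
      }

      public void close(Reporter reporter) throws IOException {
        out.close();
      }
    };
  }
}

It would then be referenced from the streaming command with -outputformat (plus shipping the jar, e.g. via -libjars); I have not verified the exact invocation here, so treat that part as a hint rather than a recipe.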
Re: Regular expressions in fs paths?
I solved this in the end by using a shell script (initiated by an Oozie shell action) to use grep and loop through the results - I didn't have to use the -v option, as the -e option gives you access to a fuller range of regular expression functionality. Thanks for your help (again!) Rich. Charles On 10 September 2014 16:50, Rich Haase rdha...@gmail.com wrote: HDFS doesn't support the full range of glob matching you will find in Linux. If you want to exclude all files from a directory listing that meet certain criteria, try doing your listing and using grep -v to exclude the matching records.
The running job is blocked for a while if the queue is short of resources
Hi experts, I faced one strange issue I cannot understand; can you tell me if this is a bug or if I configured something wrong? Below is my situation. I'm running the Hadoop 2.2.0 release and all my jobs are uberized; each node can only run a single job at a time. I use the CapacityScheduler and configured 2 queues (default and small), and I only give 5% capacity (10 nodes) to the small queue. What I found is that the throughput of the small queue is very poor when it's under heavy load (the inflow rate exceeds the processing speed). I checked the log of a job and found that each job takes an extra 1-2 minutes in the job commit phase; see the log below:
2014-09-10 14:01:13,665 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Status update from attempt_1410336300553_9902_m_00_0
2014-09-10 14:01:13,665 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Progress of TaskAttempt attempt_1410336300553_9902_m_00_0 is : 1.0
2014-09-10 14:01:13,670 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Done acknowledgement from attempt_1410336300553_9902_m_00_0
2014-09-10 14:01:13,670 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.Task: Task 'attempt_1410336300553_9902_m_00_0' done.
2014-09-10 14:01:13,671 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1410336300553_9902_m_00_0 TaskAttempt Transitioned from RUNNING to SUCCESS_CONTAINER_CLEANUP
2014-09-10 14:01:13,671 INFO [uber-SubtaskRunner] org.apache.hadoop.mapred.LocalContainerLauncher: Processing the event EventType: CONTAINER_REMOTE_CLEANUP for container container_1410336300553_9902_01_01 taskAttempt attempt_1410336300553_9902_m_00_0
2014-09-10 14:01:13,675 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: attempt_1410336300553_9902_m_00_0 TaskAttempt Transitioned from SUCCESS_CONTAINER_CLEANUP to SUCCEEDED
2014-09-10 14:01:13,685 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: Task succeeded with attempt attempt_1410336300553_9902_m_00_0
2014-09-10 14:01:13,687 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.TaskImpl: task_1410336300553_9902_m_00 Task Transitioned from RUNNING to SUCCEEDED
2014-09-10 14:01:13,693 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Num completed Tasks: 1
2014-09-10 14:01:13,694 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.TIEMRAppMetrics: task is completed on
2014-09-10 14:01:13,697 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1410336300553_9902Job Transitioned from RUNNING to COMMITTING
2014-09-10 14:01:13,697 INFO [CommitterEvent Processor #1] org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler: Processing the event EventType: JOB_COMMIT
2014-09-10 14:02:30,121 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: Calling handler for JobFinishedEvent
2014-09-10 14:02:30,122 INFO [AsyncDispatcher event handler] org.apache.hadoop.mapreduce.v2.app.job.impl.JobImpl: job_1410336300553_9902Job Transitioned from COMMITTING to SUCCEEDED
As you can see, the job commit started at 14:01:13 and ended at 14:02:30, so it took a lot of time. I also captured a thread dump of the job (AppMaster); the interesting part is here:
CommitterEvent Processor #1 id=91 idx=0x16c tid=29593 prio=5 alive, waiting, native_blocked -- Waiting for notification
on: org/apache/hadoop/mapreduce/v2/app/commit/CommitterEventHandler$EventProcessor@0x906b46d0[fat lock]
at jrockit/vm/Threads.waitForNotifySignal(JLjava/lang/Object;)Z(Native Method)
at java/lang/Object.wait(J)V(Native Method)
at java/lang/Object.wait(Object.java:485)
at org/apache/hadoop/mapreduce/v2/app/commit/CommitterEventHandler$EventProcessor.waitForValidCommitWindow(CommitterEventHandler.java:313)
^-- Lock released while waiting: org/apache/hadoop/mapreduce/v2/app/commit/CommitterEventHandler$EventProcessor@0x906b46d0[fat lock]
at org/apache/hadoop/mapreduce/v2/app/commit/CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:252)
at org/apache/hadoop/mapreduce/v2/app/commit/CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:216)
at java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java/lang/Thread.run(Thread.java:662)
at jrockit/vm/RNI.c2java(J)V(Native Method)
-- end of trace
I checked the code; it is blocked waiting for the heartbeat to the RM. I also checked org.apache.hadoop.mapreduce.v2.app.local.LocalContainerAllocator.heartbeat(); it seems to send another resource allocate request to the RM. So my understanding (correct me if wrong) is if the
Balancing is very slow.
hadoop 2.4.1. Balancing is very slow.
$HADOOP_PREFIX/bin/hdfs dfsadmin -setBalancerBandwidth 52428800
It takes a long time to move one block:
2014. 09. 11. 11:38:01 Block begins to move
2014-09-11 11:47:20 Complete block move
# 10.2.1.211 netstat, block begins to move, 10.2.1.210 -->>> 10.2.1.211
2014. 09. 11. 11:38:01 tcp 1110650 0 10.2.1.211:56819 10.2.1.210:40010 ESTABLISHED -
# datanode log, 10.2.1.211
2014-09-11 11:47:09,819 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Copied BP-1770955034-0.0.0.0-1401163460236:blk_1077753386_4013196 to /10.2.1.211:56819
# namenode balancer log
2014-09-11 11:47:20,782 INFO org.apache.hadoop.hdfs.server.balancer.Balancer: Successfully moved blk_1077753386_4013196 with size=134217728 from 10.2.1.204:40010 to 10.2.1.211:40010 through 10.2.1.210:40010
# check network state: file transfer speed using scp is 76.7MB/s
dummy.tar 100% 230MB 76.7MB/s 00:03