Re: How to upgrade Hadoop 2.2 to 2.4
Rolling (cluster-running) upgrade is only supported starting from 2.4; that is, upgrading from 2.4 to 2.4+ is supported, but upgrading from 2.2 to 2.4 this way is not.

Regards,
Stanley Shi

On Fri, Jun 20, 2014 at 5:50 PM, Jason Meng neu...@126.com wrote:
> Hi,
> I set up a Hadoop 2.2 cluster with NameNode HA. How do I upgrade it to 2.4? I tried to upgrade according to http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html, but it doesn't work. Is there a detailed step-by-step guide that says when to shut down the NameNode, or when to change environment variables like HADOOP_HOME, at each step? There are two modes of upgrade, cluster running or cluster stopped; either one is fine.
> Thanks.
> Regards,
> Jason
Re: grouping similar items toegther
The "similar" relation is not transitive: if a is similar to b and b is similar to c, a may still not be similar to c. Given that, how do you define the groups?

Regards,
Stanley Shi

On Sat, Jun 21, 2014 at 2:51 AM, parnab kumar parnab.2...@gmail.com wrote:
> Hi,
> I have a set of hashes. Each hash is a 32-bit long integer. Two hashes are similar if their Hamming distance is less than or equal to 2. I need to group together hashes that are mutually similar, i.e. each line of the output file should contain mutually similar keys. I implemented a custom Writable whose compareTo method looks as follows:
>
>     public int compareTo(Object o) {
>         Long thisHash = this.hash;
>         Long thatHash = ((DocumentHash) o).hash;
>         if (hammingDist(thisHash, thatHash) <= 2) {
>             return 0;
>         }
>         return thisHash.compareTo(thatHash);
>     }
>
> In the map function I emit the custom Writable as the key, and in the reduce I group by the keys. I checked the output file, exhaustively tested the hashes manually, and found that most hashes in each line are mutually similar. However, some hashes, even though they are similar to a group, are not in its output line. For example, consider the following hashes:
>
>     HASH1 = 69215512
>     HASH2 = 69215512
>     HASH3 = 69215512
>     HASH4 = 69215568
>
> All four of these hashes are mutually similar, within a distance of 2 of each other. Still, the output file contains two separate records, with HASH1 and HASH2 on one line and HASH3 and HASH4 on another:
>
>     HASH4 HASH3
>     HASH1 HASH2
>
> Can someone explain why this happens?
> Thanks,
> Parnab.
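Stanley's non-transitivity point is the root cause: a compareTo that returns 0 for "similar" pairs breaks the total-order contract that the MapReduce sort/group relies on, so which hashes end up in the same reduce group depends on the order comparisons happen to be made. A small standalone sketch (values are made up for illustration, not from the thread; `hammingDist` is assumed to be popcount of XOR):

```java
// Sketch: a "similar within Hamming distance 2" relation is not transitive,
// so it cannot serve as the equality case of a compareTo().
public class HammingDemo {
    // Hamming distance between two hashes: number of differing bits.
    static int hammingDist(long a, long b) {
        return Long.bitCount(a ^ b);
    }

    static boolean similar(long a, long b) {
        return hammingDist(a, b) <= 2;
    }

    public static void main(String[] args) {
        long a = 0b0000L, b = 0b0011L, c = 0b1111L;
        System.out.println(similar(a, b)); // true:  distance 2
        System.out.println(similar(b, c)); // true:  distance 2
        System.out.println(similar(a, c)); // false: distance 4 -- not transitive
    }
}
```

Because the relation is not transitive, grouping by such a comparator is ill-defined; forming the groups generally needs a clustering step instead, e.g. union-find over all similar pairs.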
Re:Re: How to upgrade Hadoop 2.2 to 2.4
Thanks, I see. I found http://www.queryhome.com/45727/how-to-upgrade-from-hadoop-2-3-to-hadoop-2-4 for upgrading with the cluster stopped.

On 2014-06-24 02:04:19, Stanley Shi s...@gopivotal.com wrote:
> Rolling (cluster-running) upgrade is only supported starting from 2.4; that is, upgrading from 2.4 to 2.4+ is supported, but upgrading from 2.2 to 2.4 this way is not.
> Regards,
> Stanley Shi
> On Fri, Jun 20, 2014 at 5:50 PM, Jason Meng neu...@126.com wrote:
>> Hi, I set up a Hadoop 2.2 cluster with NameNode HA. How do I upgrade it to 2.4? I tried to upgrade according to http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html, but it doesn't work. Is there a detailed step-by-step guide that says when to shut down the NameNode, or when to change environment variables like HADOOP_HOME, at each step? There are two modes of upgrade, cluster running or cluster stopped; either one is fine.
>> Thanks. Regards, Jason
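For reference, the stopped-cluster path usually follows a sequence like the one below. This is only a hypothetical outline (a non-HA single NameNode is shown for brevity; an HA setup like Jason's needs additional steps), and every step should be verified against the upgrade documentation for the exact versions involved:

```
stop-dfs.sh                               # 1. stop HDFS while still on 2.2
# 2. install the 2.4 binaries and repoint the environment at them
export HADOOP_HOME=/opt/hadoop-2.4.0      # hypothetical install path
hadoop-daemon.sh start namenode -upgrade  # 3. start the NameNode in upgrade mode
hadoop-daemons.sh start datanode          # 4. start the DataNodes
hdfs dfsadmin -finalizeUpgrade            # 5. finalize once the cluster checks out
```

Until `-finalizeUpgrade` is run, HDFS keeps the pre-upgrade state so a rollback remains possible; finalizing is irreversible.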
Finding maximum value in reducer
I have a scenario. The output from a previous job (job1) is http://pastebin.com/ADa8fTGB. In the next job (job2) I need to find the i keys having the maximum values, e.g. for i=3, the 3 keys with the largest values (i will be a custom parameter). How should I approach this? Should I compute max() in job2's mapper, since the keys are unique (the output comes from the previous reducer), or find the max in job2's reducer? But then, how do I find i keys?

I tried this: instead of emitting the value as the value in the reducer, I emitted the value as the key, so that I get the values in ascending order. Then I wrote the next MR job, where the mapper simply emits the key/value and the reducer finds the max key. But I am stuck again: that cannot be done once we try to recover the id, because only the id is unique; the values are not unique. How do I solve this?

--
Thanks & Regards
Unmesha Sreeveni U.B
Hadoop, Bigdata Developer
Center for Cyber Security | Amrita Vishwa Vidyapeetham
http://www.unmeshasreeveni.blogspot.in/
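One way around the duplicate-values problem is to never invert the map at all: sort (key, value) pairs by value and take the first i, so ids with equal values both survive. A minimal sketch of just the selection logic in plain Java (the MR wiring is omitted; class and variable names here are made up):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class TopKSelect {
    // Return the k keys with the largest values. Duplicate values are fine
    // because we sort entries (key AND value together) rather than using
    // the value as a lookup key.
    static List<String> topK(Map<String, Long> counts, int k) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((x, y) -> Long.compare(y.getValue(), x.getValue())); // descending by value
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Long> e : entries.subList(0, Math.min(k, entries.size()))) {
            out.add(e.getKey());
        }
        return out;
    }
}
```

In a MapReduce setting the same idea becomes: each mapper keeps its local top-i pairs and emits them, and a single reducer merges those candidates and takes the global top-i, carrying the id along with the value the whole way so equal values never collide.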
UNSUBSCRIBE
UNSUBSCRIBE
Don Hilborn
Solutions Engineer, Hortonworks
Mobile: 832-444-5463
Email: dhilb...@hortonworks.com
Website: http://www.hortonworks.com/
Hortonworks enables your modern data architecture.

On Jun 24, 2014, at 4:23 AM, ankisetty kumar ankisetty.ku...@gmail.com wrote:

--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: UNSUBSCRIBE
Please send email to user-unsubscr...@hadoop.apache.org

Cheers

On Jun 24, 2014, at 2:47 AM, Don Hilborn dhilb...@hortonworks.com wrote:
MR job failed due to java.io.FileNotFoundException, but the path for ${mapreduce.jobhistory.done-dir} is not correct
Hi,

I'm running Hadoop 2.2.0, and occasionally some of my MR jobs fail due to the error below. The job was running on 2014-06-24, but the path points to /2014/06/01 — do you guys know what's going on here?

2014-06-24 08:04:28.170 -0700 [pool-1-thread-157] java.io.IOException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.YarnRuntimeException): java.io.FileNotFoundException: File /tmp/hadoop-yarn/staging/history/done/2014/06/01/59 does not exist.
    at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.getFullJob(CachedHistoryStorage.java:122)
    at org.apache.hadoop.mapreduce.v2.hs.JobHistory.getJob(JobHistory.java:207)
    at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler$1.run(HistoryClientService.java:200)
    at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler$1.run(HistoryClientService.java:196)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.verifyAndGetJob(HistoryClientService.java:196)
    at org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getJobReport(HistoryClientService.java:228)
    at org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getJobReport(MRClientProtocolPBServiceImpl.java:122)
    at org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$2.callBlockingMethod(MRClientProtocol.java:275)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
Caused by: java.io.FileNotFoundException: File /tmp/hadoop-yarn/staging/history/done/2014/06/01/59 does not exist.
    at org.apache.hadoop.fs.Hdfs$DirListingIterator.<init>(Hdfs.java:205)
    at org.apache.hadoop.fs.Hdfs$DirListingIterator.<init>(Hdfs.java:189)
    at org.apache.hadoop.fs.Hdfs$2.<init>(Hdfs.java:171)
    at org.apache.hadoop.fs.Hdfs.listStatusIterator(Hdfs.java:171)
    at org.apache.hadoop.fs.FileContext$20.next(FileContext.java:1392)
    at org.apache.hadoop.fs.FileContext$20.next(FileContext.java:1387)
    at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
    at org.apache.hadoop.fs.FileContext.listStatus(FileContext.java:1387)
    at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.scanDirectory(HistoryFileManager.java:655)
    at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.scanDirectoryForHistoryFiles(HistoryFileManager.java:668)
    at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.scanOldDirsForJob(HistoryFileManager.java:825)
    at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.getFileInfo(HistoryFileManager.java:854)
    at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.getFullJob(CachedHistoryStorage.java:107)
    ... 18 more

Client-side stack:

    at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:331) ~[thirdeye-action.jar:na]
    at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:416) ~[thirdeye-action.jar:na]
    at org.apache.hadoop.mapred.TIEYarnRunner.getJobStatus(TIEYarnRunner.java:534) ~[thirdeye-action.jar:na]
    at org.apache.hadoop.mapreduce.Job$1.run(Job.java:314) ~[thirdeye-action.jar:na]
    at org.apache.hadoop.mapreduce.Job$1.run(Job.java:311) ~[thirdeye-action.jar:na]
    at java.security.AccessController.doPrivileged(Native Method) ~[na:1.6.0_23]
    at javax.security.auth.Subject.doAs(Subject.java:396) ~[na:1.6.0_23]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) ~[thirdeye-action.jar:na]
    at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:311) ~[thirdeye-action.jar:na]
    at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:599) ~[thirdeye-action.jar:na]
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1294) ~[thirdeye-action.jar:na]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) [na:1.6.0_23]
    at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317) [na:1.6.0_23]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150) [na:1.6.0_23]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98) [na:1.6.0_23]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180) [na:1.6.0_23]
    at
Connection refused
When I try to run an HDFS API program I get "connection refused". However, hadoop fs -ls and other commands work fine, and I don't see anything wrong in the application web UI.

    [mohit@localhost eg]$ hadoop jar hadoop-labs-0.0.1-SNAPSHOT.jar org.hadoop.qstride.lab.eg.HelloWorld hdfs://localhost/user/mohit/helloworld.dat
    14/06/24 22:11:12 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Exception in thread "main" java.net.ConnectException: Call From localhost.localdomain/127.0.0.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
    [mohit@localhost eg]$ hadoop fs -ls /user/mohit/eg
    Found 1 items
    -rw-r--r--   1 mohit hadoop   13 2014-06-24 21:34 /user/mohit/eg/helloworld.dat
    [mohit@localhost eg]$
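A common cause of this symptom (an assumption here, since the program's configuration isn't shown): the URI hdfs://localhost has no port, so the client falls back to the default 8020, while the NameNode may actually be listening on a different port that the `hadoop` shell commands pick up from core-site.xml. Checking that the client sees the same core-site.xml as the CLI, and that the URI matches what the NameNode binds to, is a first step. An example fragment (the port below is illustrative, not known from the thread):

```
<!-- core-site.xml: must be on the client program's classpath, and the
     host *and* port must match what the NameNode actually binds to.
     8020 here is only an example; check your running cluster. -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:8020</value>
  </property>
</configuration>
```

If the file is correct but not on the program's classpath, the Java client silently uses built-in defaults, which can produce exactly this mismatch even while the CLI works.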