Re: How to upgrade Hadoop 2.2 to 2.4

2014-06-24 Thread Stanley Shi
Upgrading while the cluster is running is only supported starting from 2.4;
that is, upgrading from 2.4 to 2.4+ is supported, while upgrading from 2.2 to
2.4 that way is not.

Regards,
Stanley Shi



On Fri, Jun 20, 2014 at 5:50 PM, Jason Meng neu...@126.com wrote:

 Hi,

 I set up a Hadoop 2.2 cluster with NameNode HA. How can I upgrade it to 2.4?
 I tried to upgrade according to
 http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html,
 but it doesn't work. Is there a detailed step-by-step guide that says in
 which step to shut down the NameNode or change environment variables like
 HADOOP_HOME? There are two modes of upgrading, cluster running or cluster
 stopped. Either one is fine. Thanks.

 Regards,
 Jason





Re: grouping similar items toegther

2014-06-24 Thread Stanley Shi
The similarity relation is not transitive; that means if a is similar to b and
b is similar to c, a may still not be similar to c.
How, then, do you do the grouping?
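
For example, here is a minimal sketch of the non-transitivity, using the
Hamming-distance similarity from the question (implementing hammingDist via
Long.bitCount is an assumption about the poster's helper):

    public class TransitivityDemo {
        // Hamming distance: number of differing bits.
        static int hammingDist(long a, long b) {
            return Long.bitCount(a ^ b);
        }

        public static void main(String[] args) {
            long a = 0b0000L, b = 0b0011L, c = 0b1111L;
            System.out.println(hammingDist(a, b)); // 2 -> a is similar to b
            System.out.println(hammingDist(b, c)); // 2 -> b is similar to c
            System.out.println(hammingDist(a, c)); // 4 -> a is NOT similar to c
        }
    }

Because compareTo returns 0 for similar pairs, the ordering it defines is not
a valid total order, so the framework's sorting and grouping can legally split
such a chain of similar hashes into separate reduce groups.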

Regards,
Stanley Shi



On Sat, Jun 21, 2014 at 2:51 AM, parnab kumar parnab.2...@gmail.com wrote:

 Hi,

 I have a set of hashes. Each hash is a 32-bit long integer. Two hashes
 are similar if their Hamming distance is less than or equal to 2.

 I need to group together hashes that are mutually similar to one another,
 i.e. each line of the output file should contain a set of mutually similar keys.

 I implemented a custom Writable, and its compareTo method looks as follows:

 public int compareTo(Object o) {
     Long thisHash = this.hash;
     Long thatHash = ((DocumentHash) o).hash;
     // Treat hashes within Hamming distance 2 as equal so they group together.
     if (hammingDist(thisHash, thatHash) <= 2) {
         return 0;
     }
     return thisHash.compareTo(thatHash);
 }


 In the map function I emit the custom Writable as the key, and the reduce
 groups by the keys.

 I checked the output file, tested the hashes manually and exhaustively, and
 found that the hashes in each line are mostly mutually similar. However, I
 found that some hashes are not placed with a group even though they are
 similar to it.

 For example: consider the following hashes :

 HASH1 = 69215512
 HASH2 = 69215512
 HASH3 = 69215512
 HASH4 = 69215568

 All the above 4 hashes are mutually similar and are within a distance of 2 of
 each other. Still, in the output file I found two separate records, where
 HASH1 and HASH2 occur in one line and HASH3 and HASH4 occur in another line,
 as follows:

 HASH4   HASH3
 HASH1   HASH2


 Can someone explain why the above happens?


 Thanks,
 Parnab.





Re: Re: How to upgrade Hadoop 2.2 to 2.4

2014-06-24 Thread Jason Meng
Thanks, I see. I found
http://www.queryhome.com/45727/how-to-upgrade-from-hadoop-2-3-to-hadoop-2-4 for
upgrading with the cluster stopped.







On 2014-06-24 02:04:19, Stanley Shi s...@gopivotal.com wrote:

Upgrading while the cluster is running is only supported starting from 2.4;
that is, upgrading from 2.4 to 2.4+ is supported, while upgrading from 2.2 to
2.4 that way is not.


Regards,
Stanley Shi,





On Fri, Jun 20, 2014 at 5:50 PM, Jason Meng neu...@126.com wrote:

Hi,

I set up a Hadoop 2.2 cluster with NameNode HA. How can I upgrade it to 2.4? I
tried to upgrade according to
http://hadoop.apache.org/docs/r2.4.0/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html,
but it doesn't work. Is there a detailed step-by-step guide that says in which
step to shut down the NameNode or change environment variables like HADOOP_HOME?
There are two modes of upgrading, cluster running or cluster stopped. Either
one is fine. Thanks.

Regards,
Jason







Finding maximum value in reducer

2014-06-24 Thread unmesha sreeveni
I have a scenario.

Output from previous job1 is http://pastebin.com/ADa8fTGB.

In the next job, job2, I need to find the i keys having the maximum values.

e.g. i = 3: find the 3 keys with the maximum values.
(i will be a custom parameter)

How should I approach this?

Should we calculate max() in job2's mapper, since the keys will be unique (the
input is the output of the previous reducer)?

or

find the max in the second job's reducer? But again, how do I find i keys?

I tried it this way:
instead of emitting the value as the value in the reducer, I emitted the value
as the key, so I get the values in ascending order. Then I wrote the next MR
job, whose mapper simply emits the key/value.

Its reducer finds the max of the keys. But I am stuck again: that does not
work once we try to get back the id, because only the id is unique; the
values are not.

How do I solve this?
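
A minimal sketch of the usual top-k pattern, assuming job1's output lines look
like key<TAB>value with a numeric value (the class name TopKMapper and the
parameter name topk.i are illustrative, not from the thread): each map task
keeps only its local i largest entries in a TreeMap and emits them in
cleanup(); a single reducer then applies the same trimming to get the global
top i.

    import java.io.IOException;
    import java.util.Map;
    import java.util.TreeMap;

    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class TopKMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
        private final TreeMap<Long, String> topK = new TreeMap<Long, String>();
        private int k;

        @Override
        protected void setup(Context context) {
            // The custom parameter i, passed through the job configuration.
            k = context.getConfiguration().getInt("topk.i", 3);
        }

        @Override
        protected void map(LongWritable offset, Text line, Context context) {
            String[] parts = line.toString().split("\t"); // key<TAB>value
            topK.put(Long.parseLong(parts[1]), parts[0]);  // keyed by value
            if (topK.size() > k) {
                topK.remove(topK.firstKey()); // evict the smallest value
            }
        }

        @Override
        protected void cleanup(Context context) throws IOException, InterruptedException {
            // Emit this task's local top k; the single reducer repeats the trimming.
            for (Map.Entry<Long, String> e : topK.entrySet()) {
                context.write(new LongWritable(e.getKey()), new Text(e.getValue()));
            }
        }
    }

One caveat for this exact scenario: since the values are not unique, a TreeMap
keyed on the value alone silently drops ties; if ties matter, key it on a
(value, id) composite instead.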

-- 
Thanks & Regards

Unmesha Sreeveni U.B
Hadoop, Bigdata Developer
Center for Cyber Security | Amrita Vishwa Vidyapeetham
http://www.unmeshasreeveni.blogspot.in/


UNSUBSCRIBE

2014-06-24 Thread ankisetty kumar



UNSUBSCRIBE

2014-06-24 Thread Don Hilborn


Don Hilborn - Solutions Engineer, Hortonworks
Mobile: 832-444-5463
Email: dhilb...@hortonworks.com
Website: http://www.hortonworks.com/


Hortonworks enables your modern data architecture.


 On Jun 24, 2014, at 4:23 AM, ankisetty kumar ankisetty.ku...@gmail.com 
 wrote:
 
 



Re: UNSUBSCRIBE

2014-06-24 Thread Ted Yu
Please send email to user-unsubscr...@hadoop.apache.org

Cheers

On Jun 24, 2014, at 2:47 AM, Don Hilborn dhilb...@hortonworks.com wrote:

 
 
 Don Hilborn - Solutions Engineer, Hortonworks
 Mobile: 832-444-5463
 Email: dhilb...@hortonworks.com
 Website: http://www.hortonworks.com/
 
 
 Hortonworks enables your modern data architecture.
 
 
 On Jun 24, 2014, at 4:23 AM, ankisetty kumar ankisetty.ku...@gmail.com 
 wrote:
 
 


MR job failed due to java.io.FileNotFoundException, but the path for ${mapreduce.jobhistory.done-dir} is not correct

2014-06-24 Thread Anfernee Xu
Hi,

I'm running Hadoop 2.2.0, and occasionally some of my MR jobs fail with the
error below.

The issue is that the job was running on 2014-06-24, but the path points to
/2014/06/01. Do you guys know what's going on here?

2014-06-24 08:04:28.170 -0700 [pool-1-thread-157] java.io.IOException:
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.yarn.exceptions.YarnRuntimeException):
java.io.FileNotFoundException: File
/tmp/hadoop-yarn/staging/history/done/2014/06/01/59 does not exist.
at
org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.getFullJob(CachedHistoryStorage.java:122)
at org.apache.hadoop.mapreduce.v2.hs.JobHistory.getJob(JobHistory.java:207)
at
org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler$1.run(HistoryClientService.java:200)
at
org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler$1.run(HistoryClientService.java:196)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at
org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.verifyAndGetJob(HistoryClientService.java:196)
at
org.apache.hadoop.mapreduce.v2.hs.HistoryClientService$HSClientProtocolHandler.getJobReport(HistoryClientService.java:228)
at
org.apache.hadoop.mapreduce.v2.api.impl.pb.service.MRClientProtocolPBServiceImpl.getJobReport(MRClientProtocolPBServiceImpl.java:122)
at
org.apache.hadoop.yarn.proto.MRClientProtocol$MRClientProtocolService$2.callBlockingMethod(MRClientProtocol.java:275)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2048)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2042)
Caused by: java.io.FileNotFoundException: File
/tmp/hadoop-yarn/staging/history/done/2014/06/01/59 does not exist.
at org.apache.hadoop.fs.Hdfs$DirListingIterator.<init>(Hdfs.java:205)
at org.apache.hadoop.fs.Hdfs$DirListingIterator.<init>(Hdfs.java:189)
at org.apache.hadoop.fs.Hdfs$2.<init>(Hdfs.java:171)
at org.apache.hadoop.fs.Hdfs.listStatusIterator(Hdfs.java:171)
at org.apache.hadoop.fs.FileContext$20.next(FileContext.java:1392)
at org.apache.hadoop.fs.FileContext$20.next(FileContext.java:1387)
at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
at org.apache.hadoop.fs.FileContext.listStatus(FileContext.java:1387)
at
org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.scanDirectory(HistoryFileManager.java:655)
at
org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.scanDirectoryForHistoryFiles(HistoryFileManager.java:668)
at
org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.scanOldDirsForJob(HistoryFileManager.java:825)
at
org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.getFileInfo(HistoryFileManager.java:854)
at
org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.getFullJob(CachedHistoryStorage.java:107)
... 18 more

at
org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:331)
~[thirdeye-action.jar:na]
at
org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:416)
~[thirdeye-action.jar:na]
at
org.apache.hadoop.mapred.TIEYarnRunner.getJobStatus(TIEYarnRunner.java:534)
~[thirdeye-action.jar:na]
at org.apache.hadoop.mapreduce.Job$1.run(Job.java:314)
~[thirdeye-action.jar:na]
at org.apache.hadoop.mapreduce.Job$1.run(Job.java:311)
~[thirdeye-action.jar:na]
at java.security.AccessController.doPrivileged(Native Method) ~[na:1.6.0_23]
at javax.security.auth.Subject.doAs(Subject.java:396) ~[na:1.6.0_23]
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
~[thirdeye-action.jar:na]
at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:311)
~[thirdeye-action.jar:na]
at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:599)
~[thirdeye-action.jar:na]
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1294)
~[thirdeye-action.jar:na]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
[na:1.6.0_23]
at
java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
[na:1.6.0_23]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
[na:1.6.0_23]
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
[na:1.6.0_23]
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
[na:1.6.0_23]
at

Connection refused

2014-06-24 Thread Mohit Anchlia
When I try to run an HDFS API program I get connection refused. However,
hadoop fs -ls and other commands work fine, and I don't see anything wrong in
the application web UI.


[mohit@localhost eg]$ hadoop jar hadoop-labs-0.0.1-SNAPSHOT.jar
org.hadoop.qstride.lab.eg.HelloWorld
hdfs://localhost/user/mohit/helloworld.dat

14/06/24 22:11:12 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable

Exception in thread "main" java.net.ConnectException: Call From
localhost.localdomain/127.0.0.1 to localhost:8020 failed on connection
exception: java.net.ConnectException: Connection refused; For more details
see: http://wiki.apache.org/hadoop/ConnectionRefused



[mohit@localhost eg]$ hadoop fs -ls /user/mohit/eg

 Found 1 items

-rw-r--r-- 1 mohit hadoop 13 2014-06-24 21:34 /user/mohit/eg/helloworld.dat

[mohit@localhost eg]$
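
One common cause of this symptom is that the standalone program's
Configuration does not see the cluster's core-site.xml, so a URI like
hdfs://localhost with no port falls back to the default NameNode RPC port
8020 even when the NameNode actually listens elsewhere. Below is a minimal
diagnostic sketch under that assumption (the class name HdfsConnectCheck is
illustrative; the thread does not show what HelloWorld does):

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsConnectCheck {
        public static void main(String[] args) throws IOException {
            // Picks up core-site.xml only if it is on the classpath.
            Configuration conf = new Configuration();
            // If this prints file:/// or a URI with the wrong port, the client
            // is not reading the cluster configuration.
            System.out.println("fs.defaultFS = " + conf.get("fs.defaultFS"));

            // Connect using whatever fs.defaultFS resolves to.
            FileSystem fs = FileSystem.get(conf);
            System.out.println(fs.exists(new Path("/user/mohit/eg/helloworld.dat")));
        }
    }

If fs.defaultFS looks correct here but the run still fails, compare the port
implied by the URI the program uses (hdfs://localhost implies 8020) with the
port the NameNode reports in its logs or web UI.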