Re: metric type

2013-08-30 Thread lei liu
There is a @Metric MutableCounterLong bytesWritten attribute in DataNodeMetrics; is it used for IO/sec statistics? 2013/8/31 Jitendra Yadav > Hi, > > For IO/sec statistics I think MutableCounterLongRate and > MutableCounterLong are more useful than others, and for xceiver thread > number I'm not bi

Re: Job config before read fields

2013-08-30 Thread Shahab Yunus
What I meant was that you might have to split or redesign your logic or your use case (which we don't know about). Regards, Shahab On Fri, Aug 30, 2013 at 10:31 PM, Adrian CAPDEFIER wrote: > But how would the comparator have access to the job config? > > > On Sat, Aug 31, 2013 at 2:38 AM, Shahab

Re: Job config before read fields

2013-08-30 Thread Adrian CAPDEFIER
But how would the comparator have access to the job config? On Sat, Aug 31, 2013 at 2:38 AM, Shahab Yunus wrote: > I think you have to override/extend the Comparator to achieve that, > something like what is done in Secondary Sort? > > Regards, > Shahab > > > On Fri, Aug 30, 2013 at 9:01 PM, Adr
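
A minimal sketch of one way a comparator can see the job config, assuming the names and the "my.sort.reverse" property are made up for illustration: Hadoop creates sort/grouping comparators with ReflectionUtils.newInstance(theClass, conf), which injects the Configuration into any class implementing Configurable.

    import org.apache.hadoop.conf.Configurable;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;
    import org.apache.hadoop.io.WritableComparator;

    // Hypothetical comparator that reads a flag from the job config.
    public class ConfigAwareComparator extends WritableComparator implements Configurable {
        private Configuration conf;
        private boolean reverse;

        public ConfigAwareComparator() {
            super(Text.class, true); // deserialize keys so compare(a, b) below is used
        }

        @Override // called by ReflectionUtils when the framework instantiates us
        public void setConf(Configuration conf) {
            this.conf = conf;
            this.reverse = conf.getBoolean("my.sort.reverse", false); // made-up property
        }

        @Override
        public Configuration getConf() { return conf; }

        @Override
        public int compare(WritableComparable a, WritableComparable b) {
            int cmp = a.compareTo(b);
            return reverse ? -cmp : cmp;
        }
    }

The driver would register it with something like job.setSortComparatorClass(ConfigAwareComparator.class).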

Re: Multidata center support

2013-08-30 Thread Jun Ping Du
Hi, Although you can set a datacenter layer in your network topology, it is never enabled in Hadoop for lack of replica placement and task scheduling support. There is some work to add layers other than rack and node under HADOOP-8848, but it may not suit your case. Agree with Adam that a clus

Re: Job config before read fields

2013-08-30 Thread Shahab Yunus
I think you have to override/extend the Comparator to achieve that, something like what is done in Secondary Sort? Regards, Shahab On Fri, Aug 30, 2013 at 9:01 PM, Adrian CAPDEFIER wrote: > Howdy, > > I apologise for the lack of code in this message, but the code is fairly > convoluted and it w

Job config before read fields

2013-08-30 Thread Adrian CAPDEFIER
Howdy, I apologise for the lack of code in this message, but the code is fairly convoluted and it would obscure my problem. That being said, I can put together some sample code if really needed. I am trying to pass some metadata between the map & reduce steps. This metadata is read and generated
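
One common way to pass small metadata between stages, sketched below under the assumption that the metadata is known at job-submission time, is to carry it in the job Configuration; the "my.metadata" key and class names are made up for illustration.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Reducer;

    public class MetadataExample {

        // Reducer that reads job-level metadata once, in setup().
        public static class MyReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            private String metadata;

            @Override
            protected void setup(Context context) {
                metadata = context.getConfiguration().get("my.metadata", "");
            }

            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                // ... use 'metadata' alongside the grouped values ...
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set("my.metadata", "generated-before-submission"); // set in the driver
            Job job = Job.getInstance(conf, "metadata example");
            job.setReducerClass(MyReducer.class);
            // ... input/output paths, mapper, key/value classes, then job.waitForCompletion(true)
        }
    }

If the metadata is only produced inside the mappers, this route won't work, since the Configuration is frozen at submission; counters or files in the distributed cache are the usual alternatives.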

Re: metric type

2013-08-30 Thread Jitendra Yadav
Hi, For IO/sec statistics I think MutableCounterLongRate and MutableCounterLong are more useful than others, and for the xceiver thread number I'm not quite sure right now. Thanks Jitendra On Fri, Aug 30, 2013 at 1:40 PM, lei liu wrote: > > Hi Jitendra, > If I want statistics on the number of bytes read per

Re: Re: Hadoop HA error "JOURNAL is not supported in state standby"

2013-08-30 Thread orahad bigdata
Hi, Many thanks to everyone. The issue got resolved after changing the client version. Regards On Fri, Aug 30, 2013 at 1:12 PM, Francis.Hu wrote: > Did you start up your ZKFC service on both of your name nodes? > > Thanks, > Francis.Hu > > -Original Message- > From: orahad bigdata [mailto:oracle...@gmai

Re: InvalidProtocolBufferException while submitting crunch job to cluster

2013-08-30 Thread Narlin M
Looks like I was pointing to incorrect ports. After correcting the port numbers, conf.set("fs.defaultFS", "hdfs://:8020"); conf.set("mapred.job.tracker", ":8021"); I am now getting the following exception: 2880 [Thread-15] INFO org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJ
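
For reference, a minimal sketch of that correction with a placeholder hostname (the real host was elided above); 8020 and 8021 are common defaults for the NameNode RPC and JobTracker ports, but they vary by distribution:

    Configuration conf = new Configuration();
    // "master-host" is a placeholder for the actual cluster hostname.
    conf.set("fs.defaultFS", "hdfs://master-host:8020");
    conf.set("mapred.job.tracker", "master-host:8021");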

InvalidProtocolBufferException while submitting crunch job to cluster

2013-08-30 Thread Narlin M
I am getting following exception while trying to submit a crunch pipeline job to a remote hadoop cluster: Exception in thread "main" java.lang.RuntimeException: Cannot create job output directory /tmp/crunch-324987940 at org.apache.crunch.impl.mr.MRPipeline.createTempDirectory(MRPipeline.java:344)

Re: hadoop 2.0.5 datanode heartbeat issue

2013-08-30 Thread Jitendra Yadav
Hi, Your conf looks fine, but I would say that you should restart your DN once and check your NN web URL. Regards Jitendra On 8/31/13, orahad bigdata wrote: > here are my conf files. > > ---core-site.xml--- > > > fs.defaultFS > hdfs://orahadoop > > > dfs.journaln

Re: hadoop 2.0.5 datanode heartbeat issue

2013-08-30 Thread orahad bigdata
Thanks Jitendra, I have restarted my DataNode and suddenly it works for me :) now it's connected to both NNs. Do you know why this issue occurred? Thanks On Sat, Aug 31, 2013 at 1:24 AM, Jitendra Yadav wrote: > Hi, > > Your conf looks fine, but I would say that you should restart >

Re: reduce job hung in pending state: "No room for reduce task"

2013-08-30 Thread Jitendra Yadav
Hi, Did you check the free disk space on the server where your reducer task was running? It needs approx. 264 GB of free disk space to run (as per the logs). Thanks Jitendra On 8/30/13, Jim Colestock wrote: > Hello All, > > We're running into the following 2 bugs again: > https://issues.apache.org/j

Re: hadoop 2.0.5 datanode heartbeat issue

2013-08-30 Thread Jing Zhao
Another possibility I can imagine is that the old configuration property "fs.default.name" is still in your configuration with a single NN's host+ip as its value. In that case this bad value may overwrite the value of fs.defaultFS. It may be helpful if you can post your configurations. On Fri, Au

Re: hadoop 2.0.5 datanode heartbeat issue

2013-08-30 Thread orahad bigdata
Here are my conf files. ---core-site.xml--- fs.defaultFS = hdfs://orahadoop, dfs.journalnode.edits.dir = /u0/journal/node/local/data ---hdfs-site.xml--- dfs.nameservices = orahadoop, dfs.ha.namenodes.orahadoop = node1,node2, dfs.namenode.rpc-add
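
Reconstructed for readability, the flattened properties above likely came from XML like the following; the rpc-address entries are cut off in the original, so the host:port values for node1/node2 shown here are assumptions:

    <!-- core-site.xml -->
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://orahadoop</value>
    </property>

    <!-- hdfs-site.xml -->
    <property>
      <name>dfs.nameservices</name>
      <value>orahadoop</value>
    </property>
    <property>
      <name>dfs.ha.namenodes.orahadoop</name>
      <value>node1,node2</value>
    </property>
    <property>
      <name>dfs.namenode.rpc-address.orahadoop.node1</name>
      <value>nn1-host:8020</value> <!-- hypothetical host -->
    </property>
    <property>
      <name>dfs.namenode.rpc-address.orahadoop.node2</name>
      <value>nn2-host:8020</value> <!-- hypothetical host -->
    </property>

A working HA setup also needs the failover proxy provider (dfs.client.failover.proxy.provider.orahadoop) and the journal/shared-edits settings, omitted here.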

Re: hadoop 2.0.5 datanode heartbeat issue

2013-08-30 Thread orahad bigdata
Thanks Jing, I'm using the same configuration files on the datanode side. dfs.nameservices -> orahadoop (hdfs-site.xml) fs.defaultFS -> hdfs://orahadoop (core-site.xml) Thanks On 8/30/13, Jing Zhao wrote: > You may need to make sure the configuration of your DN has also been > updated for HA. If your

Re: hadoop 2.0.5 datanode heartbeat issue

2013-08-30 Thread Jing Zhao
You may need to make sure the configuration of your DN has also been updated for HA. If your DN's configuration still uses the old URL (e.g., one of your NN's host+port) for "fs.defaultFS", DN will only connect to that NN. On Fri, Aug 30, 2013 at 10:56 AM, orahad bigdata wrote: > Hi All, > > I'm

hadoop 2.0.5 datanode heartbeat issue

2013-08-30 Thread orahad bigdata
Hi All, I'm using Hadoop 2.0.5 HA with QJM. After starting the cluster I did some manual switchovers between NNs. Then I opened the web UI page for both NNs and saw a strange situation where my DN connected to the standby NN but was not sending heartbeats to the primary NameNode. Please guide. Than

WritableComparable.compareTo vs RawComparator.compareTo

2013-08-30 Thread Adeel Qureshi
For secondary sort I am implementing a RawComparator and providing that as the sortComparator. Is that the faster way, or should I use a WritableComparable as mapper output and define a compareTo method on the key itself? Also, what happens if both are defined; is one ignored?
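
A sketch of the raw route being asked about: a WritableComparator subclass that compares serialized bytes directly, avoiding key deserialization during the sort. This example assumes the key is a single Text field; the class name is illustrative.

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparator;
    import org.apache.hadoop.io.WritableUtils;

    // Hypothetical raw comparator for a key serialized as a single Text field.
    public class TextRawComparator extends WritableComparator {

        public TextRawComparator() {
            super(Text.class);
        }

        @Override
        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            // Skip the vint length prefix that Text writes, then compare the
            // UTF-8 bytes directly; no key objects are created.
            int n1 = WritableUtils.decodeVIntSize(b1[s1]);
            int n2 = WritableUtils.decodeVIntSize(b2[s2]);
            return compareBytes(b1, s1 + n1, l1 - n1, b2, s2 + n2, l2 - n2);
        }
    }

As for defining both: as far as I know the sort uses whichever comparator is registered (via setSortComparatorClass or the key class's registered default), so a raw comparator effectively bypasses compareTo during the sort.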

"bad interpreter: Text file busy" and other errors in Hadoop 2.1.0-beta

2013-08-30 Thread Jian Fang
Hi, I upgraded to Hadoop 2.1.0-beta and suddenly I started to see error messages as follows. Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: bash: /var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1377823589199_0002/container_1377823589199_000

RE: secondary sort - number of reducers

2013-08-30 Thread java8964 java8964
Well, the reducer stage normally takes much longer than the mapper stage, because the copy/shuffle/sort all happen at this time, and they are the hard part. But before we simply say it is part of life, you need to dig into your MR jobs more to find out if you can make them faster. You are the

Re: secondary sort - number of reducers

2013-08-30 Thread Adeel Qureshi
My secondary sort on multiple keys seems to work fine with smaller data sets, but with bigger data sets (like 256 GB and 800M+ records) the mapper phase gets done pretty quickly (about 15 mins) but then the reducer phase seems to take forever. I am using 255 reducers. The basic idea is that my composite k

Re: Is hadoop thread safe?

2013-08-30 Thread Dinkar Sitaram
This comment (from http://stackoverflow.com/questions/12504690/how-to-run-hadoop-multithread-way-in-single-jvm ) may also be relevant: "Hadoop purposely does not run more than one task at the same time in one JVM for isolation purposes. And in stand-alone (local) mode, only one JVM is ever used. I

reduce job hung in pending state: "No room for reduce task"

2013-08-30 Thread Jim Colestock
Hello All, We're running into the following 2 bugs again: https://issues.apache.org/jira/browse/HADOOP-5241 https://issues.apache.org/jira/browse/MAPREDUCE-2324 Both of them are listed as closed/fixed. (I was actually the one who got Cloudera to submit MAPREDUCE-2324.) Does anyone know is anyo

Re: secondary sort - number of reducers

2013-08-30 Thread Adeel Qureshi
Yup, it was negative, and by doing this it now seems to be working fine. On Fri, Aug 30, 2013 at 3:09 AM, Shekhar Sharma wrote: > Is the hash code of that key negative? > Do something like this > > return (groupKey.hashCode() & Integer.MAX_VALUE) % numParts; > > Regards, > Som Shekhar Sharma > +

Re: authentication when uploading in to hadoop HDFS

2013-08-30 Thread Larry McCay
Hi Visioner - Depending on your actual installation, you may have all of the other APIs available to the CLI clients as well. This would potentially be a valid use case for Apache Knox - still in the incubator - see: http://knox.incubator.apache.org/ Knox provides you with a Web API Gateway for H

Re: authentication when uploading in to hadoop HDFS

2013-08-30 Thread Nitin Pawar
ssh has nothing to do with hdfs. There are three ways someone would want to write into hdfs: 1) HDFS Java API 2) hadoop command line tools 3) WebHDFS (doing POST, PUT etc.) In none of these cases does ssh play a role. So you can assume that as long as no one has access to ssh-keys, no one can get i
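
A minimal sketch of option 1, the Java API route; the cluster URI and path are placeholders:

    import java.io.OutputStream;
    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsWriteExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // "namenode-host:8020" is a placeholder for the actual NameNode address.
            FileSystem fs = FileSystem.get(URI.create("hdfs://namenode-host:8020"), conf);
            try (OutputStream out = fs.create(new Path("/tmp/example.txt"))) {
                out.write("hello hdfs".getBytes("UTF-8"));
            }
            fs.close();
        }
    }

Note that on a non-secure cluster this call simply reports the client's local username, which is the point being made in the thread: network access and HDFS-level security, not ssh, are what gate writes.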

Re: authentication when uploading in to hadoop HDFS

2013-08-30 Thread Visioner Sadak
Thanks a ton Nitin, just wanted to confirm the point below: an external user won't be able to write into our cluster using any API, right, since we didn't include his IP in our cluster using passwordless ssh for him? I guess ssh will prompt for a password for writes and reads, correct me if I am wrong :)

Re: Multidata center support

2013-08-30 Thread Adam Muise
Nothing has changed. DR best practice is still one (or more) clusters per site and replication is handled via distributed copy or some variation of it. A cluster spanning multiple data centers is a poor idea right now. On Fri, Aug 30, 2013 at 12:35 AM, Rahul Bhattacharjee < rahul.rec@gmail.
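
For reference, the distributed copy mentioned above is driven from the command line; the cluster addresses and paths below are placeholders:

    hadoop distcp hdfs://nn-site-a:8020/data/src hdfs://nn-site-b:8020/data/dest

distcp itself runs as a MapReduce job, so the copy is parallelized across the cluster rather than funneled through a single process.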

Re: [yarn] job is not getting assigned

2013-08-30 Thread Andre Kelpe
Hi Vinod, I found the issue: The yarn.nodemanager.resource.memory-mb value was too low. I set it back to the default value and the job runs fine now. Thanks! - André On Thu, Aug 29, 2013 at 7:36 PM, Vinod Kumar Vavilapalli wrote: > > This usually means there are no available resources as seen
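
For anyone hitting the same symptom, the property lives in yarn-site.xml; the value below is only an example, and it must be at least as large as the largest container the job requests:

    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <!-- example value: total MB the NodeManager can hand out to containers -->
      <value>8192</value>
    </property>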

Re: metric type

2013-08-30 Thread lei liu
Hi Jitendra, If I want statistics on the number of bytes read per second, and to display the result in Ganglia, should I use MutableCounterLong or MutableGaugeLong? If I want to display the current xceiver thread count in the datanode in Ganglia, should I use MutableCounterLong or MutableGaugeLong? Thanks,
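
A sketch of the distinction being asked about, using the metrics2 annotations: a monotonically increasing total such as bytes read is usually a counter (a per-second rate can be derived from it downstream), while an instantaneous value such as the current xceiver thread count is a gauge. The class and metric names below are illustrative.

    import org.apache.hadoop.metrics2.annotation.Metric;
    import org.apache.hadoop.metrics2.annotation.Metrics;
    import org.apache.hadoop.metrics2.lib.MutableCounterLong;
    import org.apache.hadoop.metrics2.lib.MutableGaugeInt;

    // Hypothetical metrics source illustrating counter vs. gauge.
    @Metrics(context = "dfs")
    public class ExampleMetrics {
        @Metric("Total bytes read; a bytes/sec rate can be derived downstream")
        MutableCounterLong bytesRead;

        @Metric("Current number of xceiver threads")
        MutableGaugeInt xceiverCount;

        void onRead(int numBytes) {
            bytesRead.incr(numBytes);   // counters only go up
        }

        void onXceiverStart() { xceiverCount.incr(); }  // gauges go up and down
        void onXceiverStop()  { xceiverCount.decr(); }
    }

Either way, the source still has to be registered with the metrics system, and a Ganglia sink configured in hadoop-metrics2.properties, for the values to show up in Ganglia.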

Re: Hadoop HA error "JOURNAL is not supported in state standby"

2013-08-30 Thread Jitendra Yadav
Hi, Totally agreed with Jing's reply. I faced the same issue previously; at the time I was doing a cluster upgrade. I had upgraded all the nodes, but on one of my nodes the hdfs bin was pointing to the previous version, so I changed the PATH and it worked fine for me. Thanks On Fri, Aug 30, 2013 at 2:10 AM,

Re: Hadoop HA error "JOURNAL is not supported in state standby"

2013-08-30 Thread Francis . Hu
Did you start up your ZKFC service on both of your name nodes? Thanks, Francis.Hu -Original Message- From: orahad bigdata [mailto:oracle...@gmail.com] Sent: Friday, August 30, 2013 4:09 To: user Subject: Hadoop HA error "JOURNAL is not supported in state standby" Hi, I'm facing an error while starting

Re: metric type

2013-08-30 Thread Jitendra Yadav
Hi, The link below contains the answer to your question. http://hadoop.apache.org/docs/r1.2.0/api/org/apache/hadoop/metrics2/package-summary.html Regards Jitendra On Fri, Aug 30, 2013 at 11:35 AM, lei liu wrote: > I use metrics v2; there are COUNTER and GAUGE metric types in metrics > v2. > W

Re: secondary sort - number of reducers

2013-08-30 Thread Shekhar Sharma
Is the hash code of that key negative? Do something like this: return (groupKey.hashCode() & Integer.MAX_VALUE) % numParts; Regards, Som Shekhar Sharma +91-8197243810 On Fri, Aug 30, 2013 at 6:25 AM, Adeel Qureshi wrote: > okay so when i specify the number of reducers e.g. in my example i m
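
In context, the suggestion is the standard guard in a custom Partitioner. Note the parentheses: in Java, % binds tighter than &, so without them the expression computes something quite different. A sketch with illustrative key types and a made-up "naturalKey|secondaryKey" layout:

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    // Hypothetical partitioner for a composite key: partition on the natural (grouping) key only.
    public class NaturalKeyPartitioner extends Partitioner<Text, IntWritable> {
        @Override
        public int getPartition(Text compositeKey, IntWritable value, int numParts) {
            // Assumed layout: "naturalKey|secondaryKey"; the separator is illustrative.
            String naturalKey = compositeKey.toString().split("\\|")[0];
            // Mask the sign bit before taking the modulus so the result is never negative.
            return (naturalKey.hashCode() & Integer.MAX_VALUE) % numParts;
        }
    }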

Re: authentication when uploading in to hadoop HDFS

2013-08-30 Thread Nitin Pawar
will have access to read from hdfs using webhdfs: ===> you may want to secure it with IP- and username-based authentication; as of now we don't have any security specific to hdfs at the user level, we have set permissions=true for a particular user >if you are managing user level access control then it
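
For completeness, permission checking in HDFS is toggled by a single property in hdfs-site.xml (dfs.permissions in Hadoop 1.x, dfs.permissions.enabled in 2.x); this is likely what the "permissions=true" above refers to:

    <property>
      <name>dfs.permissions</name> <!-- dfs.permissions.enabled on Hadoop 2.x -->
      <value>true</value>
    </property>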