bad interpreter: Text file busy and other errors in Hadoop 2.1.0-beta

2013-08-30 Thread Jian Fang
Hi, I upgraded to Hadoop 2.1.0-beta and suddenly I started to see error messages as follows. Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException: bash:

Re: no jobtracker to stop,no namenode to stop

2013-08-30 Thread NJain
Hey Nikhil, Just tried what you asked for, and yes, there are files and folders in c:/Hadoop/name (folders: current, image, previous.checkpoint, in_use.lock). I also tried with the firewall disabled. One more thing: on the JobTracker UI, when I click on '0' under

namenode name dir

2013-08-30 Thread lei liu
I use QJM; do I need to configure two directories for dfs.namenode.name.dir, one local filesystem path and one NFS path? I think the Standby NameNode also stores the fsimage, so I think I only need to configure one local file system path. Thanks, LiuLei

metric type

2013-08-30 Thread lei liu
I use metrics v2; there are COUNTER and GAUGE metric types in metrics v2. What is the difference between the two? Thanks, LiuLei

Re: namenode name dir

2013-08-30 Thread Harsh J
You are correct - a single directory should suffice. That said, if you still want multiple copies, you can continue to configure it that way. On Fri, Aug 30, 2013 at 11:32 AM, lei liu liulei...@gmail.com wrote: I use QJM; do I need to configure two directories for dfs.namenode.name.dir,
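For reference, a minimal sketch of both options; the paths are made up, and the properties would normally live in hdfs-site.xml rather than be set in code. With QJM the shared edits live on the JournalNodes and the Standby NameNode keeps its own checkpointed fsimage, so a single local directory is enough:

    import org.apache.hadoop.conf.Configuration;

    public class NameDirConfig {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Single local directory (sufficient with QJM):
        conf.set("dfs.namenode.name.dir", "/data/dfs/name");
        // Optional extra copy, comma-separated (e.g. an NFS mount):
        // conf.set("dfs.namenode.name.dir", "/data/dfs/name,/mnt/nfs/dfs/name");
        System.out.println(conf.get("dfs.namenode.name.dir"));
      }
    }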

authentication when uploading in to hadoop HDFS

2013-08-30 Thread Visioner Sadak
Hello friends, we use the FileSystem.copyFromLocalFile method of the Java API within a Tomcat container to move data into our Hadoop cluster. Will any unauthorised user be able to write into our Hadoop cluster using the Java API, or is extra authentication needed from our side?

Re: authentication when uploading in to hadoop HDFS

2013-08-30 Thread Visioner Sadak
Well, they have access to read from HDFS using WebHDFS. As of now we don't have any security specific to HDFS at the user level; we have set permissions=true for a particular user. Only admin has SSH access to the Linux clusters. On Fri, Aug 30, 2013 at 12:14 PM, Nitin Pawar nitinpawar...@gmail.com wrote:

Re: authentication when uploading in to hadoop HDFS

2013-08-30 Thread Nitin Pawar
"well have access to read from hdfs using webhdfs": you may want to secure it with IP- and username-based authentication. "as of now we don't have any security specific to hdfs user level, we have set permissions=true for a particular user": if you are managing user-level access control then it

Re: secondary sort - number of reducers

2013-08-30 Thread Shekhar Sharma
Is the hash code of that key negative? Do something like this: return (groupKey.hashCode() & Integer.MAX_VALUE) % numParts; Regards, Som Shekhar Sharma +91-8197243810 On Fri, Aug 30, 2013 at 6:25 AM, Adeel Qureshi adeelmahm...@gmail.com wrote: Okay so when I specify the number of reducers
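A minimal sketch of that fix inside a custom partitioner, using Text as a stand-in for the thread's actual composite key and value types:

    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Partitioner;

    public class GroupPartitioner extends Partitioner<Text, Text> {
      @Override
      public int getPartition(Text groupKey, Text value, int numParts) {
        // Masking with Integer.MAX_VALUE clears the sign bit, so a negative
        // hashCode() can no longer produce a negative partition index.
        return (groupKey.hashCode() & Integer.MAX_VALUE) % numParts;
      }
    }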

Re: metric type

2013-08-30 Thread Jitendra Yadav
Hi, The link below contains the answer to your question. http://hadoop.apache.org/docs/r1.2.0/api/org/apache/hadoop/metrics2/package-summary.html Regards Jitendra On Fri, Aug 30, 2013 at 11:35 AM, lei liu liulei...@gmail.com wrote: I use metrics v2; there are COUNTER and GAUGE metric types in

Re: Hadoop HA error JOURNAL is not supported in state standby

2013-08-30 Thread Francis . Hu
Did you start up your ZKFC service on both of your name nodes? Thanks, Francis.Hu -----Original Message----- From: orahad bigdata [mailto:oracle...@gmail.com] Sent: Friday, August 30, 2013 4:09 To: user Subject: Hadoop HA error JOURNAL is not supported in state standby Hi, I'm facing an error while starting

Re: Hadoop HA error JOURNAL is not supported in state standby

2013-08-30 Thread Jitendra Yadav
Hi, I totally agree with Jing's reply; I faced the same issue previously, while doing a cluster upgrade. I had upgraded all the nodes, but on one of my nodes the hdfs binary was pointing to the previous version, so I changed the PATH and it worked fine for me. Thanks On Fri, Aug 30, 2013 at 2:10

Re: metric type

2013-08-30 Thread lei liu
Hi Jitendra, If I want statistics on the number of bytes read per second, and to display the result in Ganglia, should I use MutableCounterLong or MutableGaugeLong? If I want to display the current xceiver thread count in the DataNode in Ganglia, should I use MutableCounterLong or MutableGaugeLong?

Re: [yarn] job is not getting assigned

2013-08-30 Thread Andre Kelpe
Hi Vinod, I found the issue: the yarn.nodemanager.resource.memory-mb value was too low. I set it back to the default value and the job runs fine now. Thanks! - André On Thu, Aug 29, 2013 at 7:36 PM, Vinod Kumar Vavilapalli vino...@apache.org wrote: This usually means there are no available

Re: Multidata center support

2013-08-30 Thread Adam Muise
Nothing has changed. DR best practice is still one (or more) clusters per site and replication is handled via distributed copy or some variation of it. A cluster spanning multiple data centers is a poor idea right now. On Fri, Aug 30, 2013 at 12:35 AM, Rahul Bhattacharjee

Re: authentication when uploading in to hadoop HDFS

2013-08-30 Thread Visioner Sadak
Thanks a ton Nitin, just wanted to confirm the point below: an external user won't be able to write into our cluster using any API, right, since we didn't include his IP in our cluster using passwordless SSH for him? I guess SSH will prompt for a password for writes and reads; correct me if I am wrong :)

Re: authentication when uploading in to hadoop HDFS

2013-08-30 Thread Nitin Pawar
SSH has nothing to do with HDFS. There are three ways someone would want to write into HDFS: 1) the HDFS Java API, 2) Hadoop command-line tools, 3) WebHDFS (doing POST, PUT, etc.). In all the above cases, SSH plays no role. So you can assume that as long as no one has access to ssh-keys, no one can get
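To make that concrete, here is a minimal sketch of the Java API upload path discussed in this thread (host, port and paths are made up). Note that without Kerberos, HDFS trusts the user name reported by the client, so network reachability plus HDFS file permissions are the only gates on a write like this:

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsUpload {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Any machine that can reach the NameNode RPC port can do this.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode-host:8020"), conf);
        fs.copyFromLocalFile(new Path("/tmp/data.csv"), new Path("/user/app/data.csv"));
        fs.close();
      }
    }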

Re: authentication when uploading in to hadoop HDFS

2013-08-30 Thread Larry McCay
Hi Visioner - Depending on your actual installation, you may have all of the other APIs available to the CLI clients as well. This would potentially be a valid use case for Apache Knox - still in the incubator - see: http://knox.incubator.apache.org/ Knox provides you with a Web API Gateway for

Re: secondary sort - number of reducers

2013-08-30 Thread Adeel Qureshi
Yup, it was negative, and after doing this it now seems to be working fine. On Fri, Aug 30, 2013 at 3:09 AM, Shekhar Sharma shekhar2...@gmail.com wrote: Is the hash code of that key negative? Do something like this: return (groupKey.hashCode() & Integer.MAX_VALUE) % numParts; Regards, Som

Re: Is hadoop thread safe?

2013-08-30 Thread Dinkar Sitaram
This comment (from http://stackoverflow.com/questions/12504690/how-to-run-hadoop-multithread-way-in-single-jvm ) may also be relevant: Hadoop purposely does not run more than one task at the same time in one JVM for isolation purposes. And in stand-alone (local) mode, only one JVM is ever used.

Re: secondary sort - number of reducers

2013-08-30 Thread Adeel Qureshi
My secondary sort on multiple keys seems to work fine with smaller data sets, but with bigger data sets (like 256 gigs and 800M+ records) the mapper phase gets done pretty quickly (about 15 mins), but then the reducer phase seems to take forever. I am using 255 reducers. The basic idea is that my composite

RE: secondary sort - number of reducers

2013-08-30 Thread java8964 java8964
Well, the reducer stage normally takes much longer than the mapper stage, because the copy/shuffle/sort all happen at this time, and they are the hard part. But before we simply say it is part of life, you need to dig more into your MR jobs to find out whether you can make them faster. You are

WritableComparable.compareTo vs RawComparator.compareTo

2013-08-30 Thread Adeel Qureshi
For secondary sort I am implementing a RawComparator and providing it as the sort comparator. Is that the faster way, or is using a WritableComparable as mapper output and defining a compareTo method on the key itself? Also, what happens if both are defined; is one ignored?
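For context: a raw comparator works on the serialized bytes and skips deserializing keys during sort/merge, which is why it is usually faster than relying on the key's compareTo; and when a sort comparator is set explicitly via job.setSortComparatorClass, it takes precedence, so compareTo is not consulted for sorting. A minimal sketch, assuming a key whose serialized form begins with a 4-byte int (that layout is made up, not from the thread):

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.WritableComparator;

    public class LeadingIntComparator extends WritableComparator {
      public LeadingIntComparator() {
        super(IntWritable.class); // stand-in for the real composite key class
      }

      @Override
      public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        // Compare the leading ints directly from the serialized bytes.
        return Integer.compare(readInt(b1, s1), readInt(b2, s2));
      }
    }

It would be registered with job.setSortComparatorClass(LeadingIntComparator.class).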

hadoop 2.0.5 datanode heartbeat issue

2013-08-30 Thread orahad bigdata
Hi All, I'm using Hadoop 2.0.5 HA with QJM. After starting the cluster I did some manual switchovers between the NNs. Then I opened the web UI page for both NNs and saw a strange situation where my DN was connected to the standby NN but not sending heartbeats to the primary NameNode. Please guide.

Re: hadoop 2.0.5 datanode heartbeat issue

2013-08-30 Thread Jing Zhao
You may need to make sure the configuration of your DN has also been updated for HA. If your DN's configuration still uses the old URL (e.g., one of your NN's host+port) for fs.defaultFS, the DN will only connect to that NN. On Fri, Aug 30, 2013 at 10:56 AM, orahad bigdata oracle...@gmail.com wrote:
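For reference, a sketch of the client-side HA settings the DN (like any HDFS client) needs to see. The nameservice "orahadoop" appears later in this thread, while the namenode IDs, hosts and ports are made up; these properties would normally live in core-site.xml/hdfs-site.xml rather than be set in code:

    import org.apache.hadoop.conf.Configuration;

    public class HaClientConfig {
      public static Configuration create() {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://orahadoop"); // nameservice, not a single host
        conf.set("dfs.nameservices", "orahadoop");
        conf.set("dfs.ha.namenodes.orahadoop", "nn1,nn2");
        conf.set("dfs.namenode.rpc-address.orahadoop.nn1", "nn1-host:8020");
        conf.set("dfs.namenode.rpc-address.orahadoop.nn2", "nn2-host:8020");
        conf.set("dfs.client.failover.proxy.provider.orahadoop",
            "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        return conf;
      }
    }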

Re: hadoop 2.0.5 datanode heartbeat issue

2013-08-30 Thread orahad bigdata
Thanks Jing, I'm using the same configuration files on the DataNode side: dfs.nameservices = orahadoop (hdfs-site.xml), fs.defaultFS = hdfs://orahadoop (core-site.xml). Thanks On 8/30/13, Jing Zhao j...@hortonworks.com wrote: You may need to make sure the configuration of your DN has also been updated

Re: hadoop 2.0.5 datanode heartbeat issue

2013-08-30 Thread orahad bigdata
Here are my conf files. ---core-site.xml---
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://orahadoop</value>
      </property>
      <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/u0/journal/node/local/data</value>
      </property>
    </configuration>

Re: reduce job hung in pending state: No room for reduce task

2013-08-30 Thread Jitendra Yadav
Hi, Did you check the free disk space on the server where your reducer task was running? Because it needs approx. 264 GB of free disk space to run (as per the logs). Thanks Jitendra On 8/30/13, Jim Colestock j...@ramblingredneck.com wrote: Hello All, We're running into the following 2 bugs again:

Re: hadoop 2.0.5 datanode heartbeat issue

2013-08-30 Thread orahad bigdata
Thanks Jitendra, I restarted my DataNode and suddenly it works for me :) now it's connected to both NNs. Do you know why this issue occurred? Thanks On Sat, Aug 31, 2013 at 1:24 AM, Jitendra Yadav jeetuyadav200...@gmail.com wrote: Hi, Your conf looks fine, but I would say

Re: hadoop 2.0.5 datanode heartbeat issue

2013-08-30 Thread Jitendra Yadav
Hi, Your conf looks fine, but I would say that you should restart your DN once and check your NN web UI. Regards Jitendra On 8/31/13, orahad bigdata oracle...@gmail.com wrote: Here are my conf files. ---core-site.xml--- <configuration> <property> <name>fs.defaultFS</name>

InvalidProtocolBufferException while submitting crunch job to cluster

2013-08-30 Thread Narlin M
I am getting the following exception while trying to submit a crunch pipeline job to a remote hadoop cluster: Exception in thread "main" java.lang.RuntimeException: Cannot create job output directory /tmp/crunch-324987940 at org.apache.crunch.impl.mr.MRPipeline.createTempDirectory(MRPipeline.java:344)

Re: InvalidProtocolBufferException while submitting crunch job to cluster

2013-08-30 Thread Narlin M
Looks like I was pointing to incorrect ports. After correcting the port numbers, conf.set("fs.defaultFS", "hdfs://server_address:8020"); conf.set("mapred.job.tracker", "server_address:8021"); I am now getting the following exception: 2880 [Thread-15] INFO

Re: Hadoop HA error JOURNAL is not supported in state standby

2013-08-30 Thread orahad bigdata
Hi, Many thanks to everyone. The issue got resolved after changing the client version. Regards On Fri, Aug 30, 2013 at 1:12 PM, Francis.Hu francis...@reachjunction.com wrote: Did you start up your ZKFC service on both of your name nodes? Thanks, Francis.Hu -----Original Message----- From: orahad bigdata

Re: metric type

2013-08-30 Thread Jitendra Yadav
Hi, For I/O-per-second statistics I think MutableCounterLongRate and MutableCounterLong are more useful than the others, and for the xceiver thread number I'm not quite sure right now. Thanks Jitendra On Fri, Aug 30, 2013 at 1:40 PM, lei liu liulei...@gmail.com wrote: Hi Jitendra, If I want statistics on the number
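For reference, the usual metrics2 split: a counter only ever increases (sinks like Ganglia can derive per-second rates from it), while a gauge moves up and down, which fits a current thread count. A minimal sketch of an annotated metrics source; the class, metric names and descriptions are illustrative:

    import org.apache.hadoop.metrics2.annotation.Metric;
    import org.apache.hadoop.metrics2.annotation.Metrics;
    import org.apache.hadoop.metrics2.lib.DefaultMetricsSystem;
    import org.apache.hadoop.metrics2.lib.MutableCounterLong;
    import org.apache.hadoop.metrics2.lib.MutableGaugeInt;

    @Metrics(about = "Example DataNode metrics", context = "dfs")
    public class ExampleMetrics {
      @Metric("Total bytes read") MutableCounterLong bytesRead;
      @Metric("Current xceiver thread count") MutableGaugeInt xceiverCount;

      public static ExampleMetrics create() {
        return DefaultMetricsSystem.instance()
            .register("ExampleMetrics", "Example metrics", new ExampleMetrics());
      }

      public void addBytesRead(long n) { bytesRead.incr(n); } // counter: only up
      public void setXceivers(int n) { xceiverCount.set(n); } // gauge: up or down
    }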

Job config before read fields

2013-08-30 Thread Adrian CAPDEFIER
Howdy, I apologise for the lack of code in this message, but the code is fairly convoluted and it would obscure my problem. That being said, I can put together some sample code if really needed. I am trying to pass some metadata between the map reduce steps. This metadata is read and generated

Re: Job config before read fields

2013-08-30 Thread Shahab Yunus
I think you have to override/extend the Comparator to achieve that, something like what is done in secondary sort? Regards, Shahab On Fri, Aug 30, 2013 at 9:01 PM, Adrian CAPDEFIER chivas314...@gmail.com wrote: Howdy, I apologise for the lack of code in this message, but the code is fairly

Re: Multidata center support

2013-08-30 Thread Jun Ping Du
Hi, Although you can add a datacenter layer to your network topology, it is never enabled in Hadoop for lack of replica placement and task scheduling support. There is some work to add layers other than rack and node under HADOOP-8848, but it may not suit your case. I agree with Adam that a

Re: Job config before read fields

2013-08-30 Thread Adrian CAPDEFIER
But how would the comparator have access to the job config? On Sat, Aug 31, 2013 at 2:38 AM, Shahab Yunus shahab.yu...@gmail.com wrote: I think you have to override/extend the Comparator to achieve that, something like what is done in secondary sort? Regards, Shahab On Fri, Aug 30, 2013
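One way, sketched below: MapReduce instantiates comparators via ReflectionUtils.newInstance(cls, conf), which calls setConf on any instance implementing Configurable, so a comparator declared this way receives the job configuration at creation time. Text and the property name "my.metadata.key" are stand-ins, not from the thread:

    import org.apache.hadoop.conf.Configurable;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.WritableComparable;
    import org.apache.hadoop.io.WritableComparator;

    public class ConfigAwareComparator extends WritableComparator implements Configurable {
      private Configuration conf;
      private String metadata;

      public ConfigAwareComparator() {
        super(Text.class, true); // Text stands in for the real key class
      }

      @Override
      public void setConf(Configuration conf) {
        this.conf = conf;
        // Hypothetical property carrying metadata set at job-submission time.
        this.metadata = conf.get("my.metadata.key", "");
      }

      @Override
      public Configuration getConf() { return conf; }

      @Override
      @SuppressWarnings({"rawtypes", "unchecked"})
      public int compare(WritableComparable a, WritableComparable b) {
        // this.metadata is available here to influence the ordering.
        return a.compareTo(b);
      }
    }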

Re: Job config before read fields

2013-08-30 Thread Shahab Yunus
What I meant was that you might have to split or redesign your logic or your use case (which we don't know about). Regards, Shahab On Fri, Aug 30, 2013 at 10:31 PM, Adrian CAPDEFIER chivas314...@gmail.com wrote: But how would the comparator have access to the job config? On Sat, Aug 31, 2013