Re: How to fail the Name Node or how to crash the Name Node for testing purposes.

2013-06-18 Thread Pavan Kumar Polineni
I am using Hadoop-1. I don't want HA. On Wed, Jun 19, 2013 at 12:20 PM, Azuryy Yu wrote: > Hey Pavan, > Hadoop-2.* has HDFS HA; which Hadoop version are you using? > > > > > On Wed, Jun 19, 2013 at 2:46 PM, Pavan Kumar Polineni < > smartsunny...@gmail.com> wrote: > >> I am checking for Cloudera

Re: How to fail the Name Node or how to crash the Name Node for testing purposes.

2013-06-18 Thread Azuryy Yu
Hey Pavan, Hadoop-2.* has HDFS HA; which Hadoop version are you using? On Wed, Jun 19, 2013 at 2:46 PM, Pavan Kumar Polineni < smartsunny...@gmail.com> wrote: > I am checking for Cloudera only, but without HA; we just have a single Name Node. > For testing purposes and preventive actions. Preparing e

Re: How to fail the Name Node or how to crash the Name Node for testing purposes.

2013-06-18 Thread Pavan Kumar Polineni
I am checking for Cloudera only, but without HA; we just have a single Name Node. For testing purposes and preventive actions. Preparing expected scenarios and solutions for them. On Wed, Jun 19, 2013 at 12:14 PM, Nitin Pawar wrote: > Are you testing it for HA? > Which version of Hadoop are you using?

Re: How to fail the Name Node or how to crash the Name Node for testing purposes.

2013-06-18 Thread Nitin Pawar
Are you testing it for HA? Which version of Hadoop are you using? Can you explain your test scenario in detail? On Wed, Jun 19, 2013 at 12:08 PM, Pavan Kumar Polineni < smartsunny...@gmail.com> wrote: > For testing Name Node crashes and failures; for the single point of failure. > > -- > Pavan K

Re: Unexpected end of input stream: how to locate related file(s)?

2013-06-18 Thread Nitin Pawar
I think they are using Hive; he can look at "INPUT__FILE__NAME" in Hive. I have never tried it, so please pardon me if I am wrong. On Wed, Jun 19, 2013 at 11:58 AM, Arun C Murthy wrote: > Robin, > > On Jun 18, 2013, at 11:12 PM, Robin Verlangen wrote: > > Hi Arun, > > Thank you for your reply
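
A minimal sketch of what Nitin suggests, assuming the failing data is readable through a Hive table (the table name here is a placeholder); INPUT__FILE__NAME is Hive's per-row virtual column holding the path of the file each row came from:

    # List which input files feed the query; "mytable" is hypothetical.
    hive -e "SELECT DISTINCT INPUT__FILE__NAME FROM mytable;"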

Re: How to fail the Name Node or how to crash the Name Node for testing purposes.

2013-06-18 Thread Azuryy Yu
or "kill -9 namenode_pid" to simulate NN crashed. On Wed, Jun 19, 2013 at 2:42 PM, Azuryy Yu wrote: > $HADOOP_HOME/bin/hadoop-daemon.sh stop namenode > > > > On Wed, Jun 19, 2013 at 2:38 PM, Pavan Kumar Polineni < > smartsunny...@gmail.com> wrote: > >> For Testing The Name Node Crashes and fail

Re: How to fail the Name Node or how to crash the Name Node for testing purposes.

2013-06-18 Thread Azuryy Yu
$HADOOP_HOME/bin/hadoop-daemon.sh stop namenode On Wed, Jun 19, 2013 at 2:38 PM, Pavan Kumar Polineni < smartsunny...@gmail.com> wrote: > For testing Name Node crashes and failures; for the single point of failure. > > -- > Pavan Kumar Polineni >

How to fail the Name Node or how to crash the Name Node for testing purposes.

2013-06-18 Thread Pavan Kumar Polineni
For testing Name Node crashes and failures; for the single point of failure. -- Pavan Kumar Polineni

Re: Unexpected end of input stream: how to locate related file(s)?

2013-06-18 Thread Arun C Murthy
Robin, On Jun 18, 2013, at 11:12 PM, Robin Verlangen wrote: > Hi Arun, > > Thank you for your reply. We run Hadoop 2.0.0 with MapReduce 0.20 packaged by > Cloudera. > > Do you know where to find the log files related to a specific task? Is that > also in the folder /var/log/hadoop-0.20-mapre

Re: Unexpected end of input stream: how to locate related file(s)?

2013-06-18 Thread Robin Verlangen
Hi Arun, Thank you for your reply. We run Hadoop 2.0.0 with MapReduce 0.20 packaged by Cloudera. Do you know where to find the log files related to a specific task? Is that also in the folder /var/log/hadoop-0.20-mapreduce/userlogs/job_ID/? Best regards, Robin Verlangen, Data Architect

Re: Unexpected end of input stream: how to locate related file(s)?

2013-06-18 Thread Arun C Murthy
What version of MapReduce are you using? At the beginning of the log-file you should be able to see a log msg with the input-split file name for the map. thanks, Arun On Jun 18, 2013, at 10:54 PM, Robin Verlangen wrote: > Hi there, > > How can I locate the files that cause these errors in my
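
For example (a sketch assuming Hadoop 1.x-style task logs and the userlogs path mentioned elsewhere in this thread; job_ID stands in for a real job ID), each map task's syslog records a line like "Processing split: ..." near the top:

    grep -r "Processing split" /var/log/hadoop-0.20-mapreduce/userlogs/job_ID/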

Unexpected end of input stream: how to locate related file(s)?

2013-06-18 Thread Robin Verlangen
Hi there, How can I locate the files that cause these errors in my Map/Reduce jobs? java.io.IOException: java.io.EOFException: Unexpected end of input stream at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)

Re: How Yarn execute MRv1 job?

2013-06-18 Thread Arun C Murthy
Not true, the CapacityScheduler has support for both CPU & Memory now. On Jun 18, 2013, at 10:41 PM, Rahul Bhattacharjee wrote: > Hi Devaraj, > > As for the container request request for yarn container , currently only > memory is considered as resource , not cpu. Please correct. > > Thanks,
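
For reference, a capacity-scheduler.xml sketch of the setting that turns this on (property name per the 2.x CapacityScheduler; treat it as an assumption for your exact version):

    <property>
      <name>yarn.scheduler.capacity.resource-calculator</name>
      <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
      <!-- Accounts for CPU (vcores) in addition to memory when scheduling. -->
    </property>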

Re: How Yarn execute MRv1 job?

2013-06-18 Thread Rahul Bhattacharjee
By "please correct", I meant: please correct me if my statement is wrong. On Wed, Jun 19, 2013 at 11:11 AM, Rahul Bhattacharjee < rahul.rec@gmail.com> wrote: > Hi Devaraj, > > As for the resource request for a YARN container, currently only > memory is considered as a resource, not c

Re: How Yarn execute MRv1 job?

2013-06-18 Thread Rahul Bhattacharjee
Hi Devaraj, As for the resource request for a YARN container, currently only memory is considered as a resource, not CPU. Please correct. Thanks, Rahul On Wed, Jun 19, 2013 at 11:05 AM, Devaraj k wrote: > Hi Sam, > > Please find the answers for your queries. > > > >- Yarn c

RE: How Yarn execute MRv1 job?

2013-06-18 Thread Devaraj k
Hi Sam, Please find the answers for your queries. >- Yarn could run multiple kinds of jobs(MR, MPI, ...), but, MRv1 job has >special execution process(map > shuffle > reduce) in Hadoop 1.x, and how Yarn >execute a MRv1 job? still include some special MR steps in Hadoop 1.x, like >map, sort, m
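
As background to the answer being given here: in 2.x an MRv1-style job still runs its map > shuffle > reduce phases, but inside YARN containers requested by the MapReduce ApplicationMaster. A minimal mapred-site.xml sketch selecting YARN as the framework:

    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>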

Mounting HDFS as Local File System using FUSE

2013-06-18 Thread Mohammad Mustaqeem
I want to mount HDFS as a local file system using FUSE, but I don't know how to install FUSE. I am using Ubuntu 12.04. I found these instructions http://xmodulo.com/2012/06/how-to-mount-hdfs-using-fuse.html but when I run sudo apt-get install hadoop-0.20-fuse I got the following error: Reading packag
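
A sketch of the CDH path, assuming Cloudera's apt repository has been added first (the package is not in the stock Ubuntu repositories, which would explain the apt-get failure); hadoop-fuse-dfs is the mount helper the package installs:

    # After adding Cloudera's repo per their documentation:
    sudo apt-get update
    sudo apt-get install hadoop-0.20-fuse

    # Mount HDFS (NameNode host and port are placeholders):
    sudo mkdir -p /mnt/hdfs
    hadoop-fuse-dfs dfs://namenode-host:8020 /mnt/hdfs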

Re: DFS Permissions on Hadoop 2.x

2013-06-18 Thread Harsh J
This is a HDFS bug. Like all other methods that check for permissions being enabled, the client call of setPermission should check it as well. It does not do that currently and I believe it should be a NOP in such a case. Please do file a JIRA (and reference the ID here to close the loop)! On Wed,

Re: Hadoop 1.0.3 join

2013-06-18 Thread Harsh J
Yes, it doesn't exist in the new API in 1.0.3. On Wed, Jun 19, 2013 at 6:45 AM, Ahmed Elgohary wrote: > Hello, > > I am using Hadoop 1.0.3 and trying to join multiple input files using > CompositeInputFormat. It seems to me that I have to use the old API to write > the join job, since the new API
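
A minimal old-API (org.apache.hadoop.mapred) sketch of such a join, assuming two text inputs sorted and partitioned identically on the join key; paths are illustrative:

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.KeyValueTextInputFormat;
    import org.apache.hadoop.mapred.join.CompositeInputFormat;

    // Inside a job driver:
    JobConf conf = new JobConf();
    conf.setInputFormat(CompositeInputFormat.class);
    // Build the join expression over the two inputs; mappers then receive
    // a TupleWritable with one value per joined source.
    conf.set("mapred.join.expr", CompositeInputFormat.compose(
        "inner", KeyValueTextInputFormat.class,
        new Path("/data/left"), new Path("/data/right")));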

How Yarn execute MRv1 job?

2013-06-18 Thread sam liu
Hi, 1. In Hadoop 1.x, a job will be executed by map task and reduce task together, with a typical process (map > shuffle > reduce). In Yarn, as I know, a MRv1 job will be executed only by ApplicationMaster. - Yarn could run multiple kinds of jobs (MR, MPI, ...), but, MRv1 job has special execution pr

Hadoop 1.0.3 join

2013-06-18 Thread Ahmed Elgohary
Hello, I am using Hadoop 1.0.3 and trying to join multiple input files using CompositeInputFormat. It seems to me that I have to use the old API to write the join job, since the new API does not support join in Hadoop 1.0.3. Is that correct? thanks, --ahmed

Re: DFS Permissions on Hadoop 2.x

2013-06-18 Thread Prashant Kommireddi
Looks like the jobs fail only on the first attempt and pass thereafter. Failure occurs while setting perms on the "intermediate done directory". Here is what I think is happening: 1. Intermediate done dir is (ideally) created as part of deployment (e.g., /mapred/history/done_intermediate) 2. When a

Re: DFS Permissions on Hadoop 2.x

2013-06-18 Thread Prashant Kommireddi
Hi Chris, This is while running an MR job. Please note the job is able to write files to the "/mapred" directory and fails on EXECUTE permissions. On digging in some more, it looks like the failure occurs after writing to "/mapred/history/done_intermediate". Here is a more detailed stacktrace. INFO:

Re: Shuffle design: optimization tradeoffs

2013-06-18 Thread Bertrand Dechoux
On the academic side, you might be interested to read about *resilient distributed datasets (RDDs)*: http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf. Not exactly the same subject, but it has the merit of pointing out that a solution is related to a context. Bertrand On Sat, Jun 15,

Re: Assignment of data splits to mappers

2013-06-18 Thread Bertrand Dechoux
1) The tradeoff is between reducing the overhead of distributed computing and reducing the cost of failure. Fewer tasks mean less overhead, but the cost of failure will be bigger, mainly because the distribution will be coarser. One of the reasons was outlined before. A (failed) task is related to an inpu

Re: Debugging YARN AM

2013-06-18 Thread Alejandro Abdelnur
If distributed shell is running as an unmanaged AM then you should set the debug flags for the 'hadoop jar' invocation, doing an 'export HADOOP_OPTS=' with the debug flags would do. Thx On Tue, Jun 18, 2013 at 12:32 PM, Curtis Ullerich wrote: > Update: It looks like I could add the flag at
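
For example (a sketch; the port, suspend=y, and the jar/class names are placeholders), using the same JDWP flags that appear elsewhere in this thread:

    # Suspend the client JVM until a debugger attaches on port 8000.
    export HADOOP_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8000"
    hadoop jar myapp.jar com.example.MyClient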

Re: DFS Permissions on Hadoop 2.x

2013-06-18 Thread Chris Nauroth
Prashant, can you provide more details about what you're doing when you see this error? Are you submitting a MapReduce job, running an HDFS shell command, or doing some other action? It's possible that we're also seeing an interaction with some other change in 2.x that triggers a setPermission ca

Re: How is the memory usage of containers controlled?

2013-06-18 Thread Arun C Murthy
NodeManagers monitor containers w.r.t. memory usage, and put containers in cgroups with CPU limits to restrict CPU usage. On Jun 18, 2013, at 12:30 PM, Yuzhang Han wrote: > Hi, > I am curious about how YARN containers control their memory usage. Say, I > have a MR job, and I configure that ever
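
The memory side of this is governed by NodeManager checks in yarn-site.xml; a sketch with standard 2.x property names and default-style values (treat exact names as an assumption for your version):

    <property>
      <name>yarn.nodemanager.pmem-check-enabled</name>
      <value>true</value>  <!-- kill containers that exceed physical memory -->
    </property>
    <property>
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>true</value>  <!-- kill containers that exceed virtual memory -->
    </property>
    <property>
      <name>yarn.nodemanager.vmem-pmem-ratio</name>
      <value>2.1</value>   <!-- allowed virtual:physical memory ratio -->
    </property>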

Re: Debugging YARN AM

2013-06-18 Thread Curtis Ullerich
Update: It looks like I could add the flag at line 515 of hadoop-yarn-applications-distributedshell/src/main/java/org/apache/hadoop/yarn/applications/distributedshell/Client.java (package org.apache.hadoop.yarn.applications.distributedshell). I tried this: vargs.add("-Xdebug -Xrunjdwp:transpor

How is the memory usage of containers controlled?

2013-06-18 Thread Yuzhang Han
Hi, I am curious about how YARN containers control their memory usage. Say, I have a MR job, and I configure that every map task should be assigned a 1 GB container, and every reduce task a 1.5 GB one. So, when YARN runs the containers, how is it ensured that all map containers use less than 1

RE: DFS Permissions on Hadoop 2.x

2013-06-18 Thread Leo Leung
I believe the property name should be "dfs.permissions". From: Prashant Kommireddi [mailto:prash1...@gmail.com] Sent: Tuesday, June 18, 2013 10:54 AM To: user@hadoop.apache.org Subject: DFS Permissions on Hadoop 2.x Hello, We just upgraded our cluster from 0.20.2 to 2.x (with HA) and had a qu
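
For reference, both spellings as they would appear in hdfs-site.xml (a sketch; 1.x reads dfs.permissions, while 2.x renamed it to dfs.permissions.enabled and keeps the old name as a deprecated alias):

    <!-- Hadoop 1.x name: -->
    <property>
      <name>dfs.permissions</name>
      <value>false</value>
    </property>

    <!-- Hadoop 2.x name: -->
    <property>
      <name>dfs.permissions.enabled</name>
      <value>false</value>
    </property>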

Re: DFS Permissions on Hadoop 2.x

2013-06-18 Thread Jean-Baptiste Onofré
It sounds like a change in the behavior. Regards JB On 06/18/2013 09:04 PM, Prashant Kommireddi wrote: Thanks for the reply, Chris. Yes, I am certain this worked with 0.20.2. It used a slightly different property and I have checked setting it to false actually disables checking for perms.

Re: DFS Permissions on Hadoop 2.x

2013-06-18 Thread Prashant Kommireddi
Thanks for the reply, Chris. Yes, I am certain this worked with 0.20.2. It used a slightly different property, and I have checked that setting it to false actually disables checking for perms. dfs.permissions false true On Tue, Jun 18, 2013 at 11:58 AM, Chris Nauroth wrote: > Hello Pr

Re: DFS Permissions on Hadoop 2.x

2013-06-18 Thread Chris Nauroth
Hello Prashant, Reviewing the code, it appears that the setPermission operation specifically is coded to always check ownership, even if dfs.permissions.enabled is set to false. From what I can tell, this behavior is the same in 0.20 too though. Are you certain that you weren't seeing this stack

Re: Namenode memory usage

2013-06-18 Thread Patai Sangbutsarakum
Thanks Brahma. I am kind of afraid to run the command; I had an issue on the jobtracker early this year. I launched the command and it caused the jobtracker to stop responding long enough that we had to roll the jobtracker instead. So I am kind of afraid to run it on the production namenode. Any suggest

DFS Permissions on Hadoop 2.x

2013-06-18 Thread Prashant Kommireddi
Hello, We just upgraded our cluster from 0.20.2 to 2.x (with HA) and had a question around disabling dfs permissions on the latter version. For some reason, setting the following config does not seem to work: dfs.permissions.enabled false Any other configs that might be needed f

Re: hprof profiler output location

2013-06-18 Thread yypvsxf19870706
Hi Rahul, I even searched for the files using find / -name "attempt*.profile", but still nothing was found. Can you indicate the format of the file name? Thanks. Sent from my iPhone. On 2013-6-18, 20:27, Rahul Bhattacharjee wrote: > In the same directory from which the job has been triggered. > > Thank

Re: Error in command: bin/hadoop fs -put conf input

2013-06-18 Thread Rahul Bhattacharjee
No datanodes in the cluster. Go to the cluster web portal. Thanks, Rahul On Sun, Jun 16, 2013 at 2:38 AM, sumit piparsania wrote: > Hi, > > I am getting the below error while executing the command. Kindly assist me > in resolving this issue. > > > $ bin/hadoop fs -put conf input > bin/hadoop: line 3
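
A quick way to check (1.x shell); the report should list a non-zero number of live datanodes:

    bin/hadoop dfsadmin -report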

Re: hprof profiler output location

2013-06-18 Thread Rahul Bhattacharjee
In the same directory from which the job has been triggered. Thanks, Rahul On Sun, Jun 16, 2013 at 3:33 PM, YouPeng Yang wrote: > > Hi All > > I want to profile a fraction of the tasks in a job, so I configured my > job as [1]. > However I could not get the hprof profiler output on the ho
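
For example (a sketch with Hadoop 1.x property names, assuming the job driver uses ToolRunner so the -D options are honored; jar, class, and path names are illustrative):

    # Profile the first two map and reduce attempts:
    bin/hadoop jar myjob.jar MyJob \
        -Dmapred.task.profile=true \
        -Dmapred.task.profile.maps=0-1 \
        -Dmapred.task.profile.reduces=0-1 input output

    # After the job finishes, the client pulls the profiles into the
    # directory the job was launched from:
    ls attempt_*.profile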

Re: Why my tests shows Yarn is worse than MRv1 for terasort?

2013-06-18 Thread Michel Segel
Sam, I think your cluster is too small for any meaningful conclusions to be made. Sent from a remote device. Please excuse any typos... Mike Segel On Jun 18, 2013, at 3:58 AM, sam liu wrote: > Hi Harsh, > > Thanks for your detailed response! Now, the efficiency of my Yarn cluster > improved

Re: Why my tests shows Yarn is worse than MRv1 for terasort?

2013-06-18 Thread sam liu
Hi Harsh, Thanks for your detailed response! Now, the efficiency of my Yarn cluster improved a lot after increasing the reducer number (mapreduce.job.reduces) in mapred-site.xml. But I still have some questions about the way of Yarn to execute MRv1 job: 1. In Hadoop 1.x, a job will be executed by m
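
For reference, the setting sam mentions, as it would appear in mapred-site.xml (the value is illustrative and should match cluster capacity):

    <property>
      <name>mapreduce.job.reduces</name>
      <value>8</value>
    </property>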