Re: 3 machine cluster trouble
Hi Pat -

The setting for hadoop.tmp.dir is used both locally and on HDFS and therefore should be consistent across your cluster. http://stackoverflow.com/questions/2354525/what-should-be-hadoop-tmp-dir

cheers,
-James

On Wed, May 23, 2012 at 3:44 PM, Pat Ferrel wrote:
> I have a two machine cluster and am adding a new machine. The new node has
> a different location for hadoop.tmp.dir than the other two nodes and
> refuses to start the datanode when started in the cluster. When I change
> the location pointed to by hadoop.tmp.dir to be the same on all machines it
> starts up fine on all machines.
>
> Shouldn't I be able to have the master and slave1 set as:
>
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/app/hadoop/tmp</value>
>   <description>A base for other temporary directories.</description>
> </property>
>
> And slave2 set as:
>
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/media/d2/app/hadoop/tmp</value>
>   <description>A base for other temporary directories.</description>
> </property>
>
> ??? Slave2 runs standalone in single node mode just fine. Using 0.20.205.
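For reference, keeping hadoop.tmp.dir identical across the cluster amounts to putting the same core-site.xml entry on the master and on both slaves; the path below is the one the master and slave1 already use, shown here only as an example:

<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>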
Re: Balancer exiting immediately despite having work to do.
Hi Landy -

Attachments are stripped from e-mails sent to the mailing list. Could you publish your logs on pastebin and forward the url?

cheers,
-James

On Wed, Jan 4, 2012 at 10:03 AM, Bible, Landy wrote:
> Hi all,
>
> I’m running Hadoop 0.20.2. The balancer has suddenly stopped working.
> I’m attempting to balance the cluster with a threshold of 1, using the
> following command:
>
> ./hadoop balancer -threshold 1
>
> This has been working fine, but suddenly it isn’t. It skips through 5
> iterations without actually doing any work:
>
> Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
> Jan 4, 2012 11:47:56 AM  0           0 KB                 1.87 GB             6.68 GB
> Jan 4, 2012 11:47:56 AM  1           0 KB                 1.87 GB             6.68 GB
> Jan 4, 2012 11:47:56 AM  2           0 KB                 1.87 GB             6.68 GB
> Jan 4, 2012 11:47:57 AM  3           0 KB                 1.87 GB             6.68 GB
> Jan 4, 2012 11:47:57 AM  4           0 KB                 1.87 GB             6.68 GB
> No block has been moved for 5 iterations. Exiting...
> Balancing took 524.0 milliseconds
>
> I’ve attached the full log, but I can’t see any errors indicating why it
> is failing. Any ideas? I’d really like to get balancing working again.
> My use case isn’t the norm, and it is important that the cluster stay as
> close to completely balanced as possible.
>
> --
> Landy Bible
>
> Simulation and Computer Specialist
> School of Nursing – Collins College of Business
> The University of Tulsa
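As a side note, one quick sanity check before re-running the balancer (assuming a stock 0.20.x install, run from the bin directory as in the command above) is to look at per-datanode utilization:

./hadoop dfsadmin -report     # prints DFS Used% per datanode, to confirm the cluster really is out of balance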
Re: Map Task Capacity Not Changing
(moving to mapreduce-user@, bcc'ing common-user@)

Hi Joey -

You'll want to change the value on all of your servers running tasktrackers and then restart each tasktracker to reread the configuration.

cheers,
-James

On Thu, Dec 15, 2011 at 3:30 PM, Joey Krabacher wrote:
> I have looked up how to up this value on the web and have tried all
> suggestions to no avail.
>
> Any help would be great.
>
> Here is some background:
>
> Version: 0.20.2, r911707
> Compiled: Fri Feb 19 08:07:34 UTC 2010 by chrisdo
>
> Nodes: 5
> Current Map Task Capacity : 10  <--- this is what I want to increase.
>
> What I have tried:
>
> Adding
>
>   <property>
>     <name>mapred.tasktracker.map.tasks.maximum</name>
>     <value>8</value>
>     <final>true</final>
>   </property>
>
> to mapred-site.xml on NameNode. I also added this to one of the
> datanodes for the hell of it and that didn't work either.
>
> Thanks.
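To make the advice above concrete, a sketch assuming a stock 0.20.2 layout (the value 8 is just the number from the original post): add the property to mapred-site.xml on every node that runs a tasktracker, then bounce each tasktracker so it rereads the file.

<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>8</value>
</property>

bin/hadoop-daemon.sh stop tasktracker
bin/hadoop-daemon.sh start tasktracker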
Re: Regarding pointers for LZO compression in Hive and Hadoop
Hi Abhishek -

(Redirecting to user@hive, bcc'ing common-user)

I found this blog to be particularly useful when incorporating Hive and LZO:
http://www.mrbalky.com/2011/02/24/hive-tables-partitions-and-lzo-compression/

And if you're having issues setting up LZO with Hadoop in general, check out
https://github.com/toddlipcon/hadoop-lzo

cheers,
-James

On Wed, Dec 14, 2011 at 11:32 AM, Abhishek Pratap Singh wrote:
> Hi,
>
> I'm looking for some useful docs on enabling LZO on a hadoop cluster. I tried
> a few of the blogs, but somehow it's not working.
> Here is my requirement.
>
> I have hadoop 0.20.2 and Hive 0.6. I have some tables with 1.5 TB of
> data; I want to compress them using LZO and enable LZO in hive as well as
> in hadoop.
> Let me know if you have any useful docs or pointers for the same.
>
> Regards,
> Abhishek
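For anyone following along, the Hadoop-side wiring described in the hadoop-lzo project boils down to registering the codecs in core-site.xml on every node; this sketch assumes the hadoop-lzo jar and native libraries are already installed on each machine:

<property>
  <name>io.compression.codecs</name>
  <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,com.hadoop.compression.lzo.LzoCodec,com.hadoop.compression.lzo.LzopCodec</value>
</property>
<property>
  <name>io.compression.codec.lzo.class</name>
  <value>com.hadoop.compression.lzo.LzoCodec</value>
</property>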
Re: HDFS permission denied
At this point you should follow Mathias' advice - go to the logs and determine which path has the permission issue. It's better to change the settings for that path rather than disabling permissions (i.e. making everything 777) randomly.

-jw

On Mon, Apr 25, 2011 at 10:04 AM, Peng, Wei wrote:
> James,
>
> Thanks for your replies.
> In this case, how can I set up the permissions correctly in order to run
> a hadoop job?
> Do I need to set the hadoop tmp directory (which is in the local directory
> instead of an hdfs directory, right?) to be 777?
> Since the person who maintained the hadoop cluster has left, I have no
> idea what happened. =(
>
> Wei
>
> -Original Message-
> From: jameswarr...@gmail.com [mailto:jameswarr...@gmail.com] On Behalf
> Of James Warren
> Sent: Monday, April 25, 2011 9:56 AM
> To: common-user@hadoop.apache.org
> Subject: Re: HDFS permission denied
>
> Hi Wei -
>
> In general, settings changes aren't applied until the hadoop daemons are
> restarted. Sounds like someone enabled permissions previously, but they
> didn't take hold until you rebooted your cluster.
>
> cheers,
> -James
>
> On Mon, Apr 25, 2011 at 1:19 AM, Peng, Wei wrote:
>
> > I forgot to mention that hadoop was running fine before.
> > However, after it crashed last week, the restarted hadoop cluster has
> > such permission issues.
> > So that means the settings are still the same as before.
> > Then what would be the cause?
> >
> > Wei
> >
> > -Original Message-
> > From: James Seigel [mailto:ja...@tynt.com]
> > Sent: Sunday, April 24, 2011 5:36 AM
> > To: common-user@hadoop.apache.org
> > Subject: Re: HDFS permission denied
> >
> > Check where the hadoop tmp setting is pointing to.
> >
> > James
> >
> > Sent from my mobile. Please excuse the typos.
> >
> > On 2011-04-24, at 12:41 AM, "Peng, Wei" wrote:
> >
> > > Hi,
> > >
> > > I need help very badly.
> > >
> > > I got an HDFS permission error when starting to run a hadoop job:
> > >
> > > org.apache.hadoop.security.AccessControlException: Permission denied:
> > > user=wp, access=WRITE, inode="":hadoop:supergroup:rwxr-xr-x
> > >
> > > I have the right permissions to read and write files to my own hadoop
> > > user directory.
> > > It works fine when I use hadoop fs -put. The job input and output are
> > > all from my own hadoop user directory.
> > >
> > > It seems that when a job starts running, some data needs to be written
> > > into some directory, and I don't have permission to that directory.
> > > It is strange that the inode does not show which directory it is.
> > >
> > > Why does hadoop write something to a directory with my name secretly?
> > > Do I need to be set to a particular user group?
> > >
> > > Many Thanks..
> > >
> > > Vivian
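A sketch of what "change the settings for that path" might look like once the offending path is known from the logs; the path and owner below are placeholders, and the chown/chmod commands must be run as the HDFS superuser (here, the hadoop user):

hadoop fs -ls /some/offending/path         # list it to see current owners and modes
hadoop fs -chown wp /some/offending/path   # give the job-submitting user ownership, or
hadoop fs -chmod 775 /some/offending/path  # widen the mode just for that path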
Re: HDFS permission denied
Hi Wei -

In general, settings changes aren't applied until the hadoop daemons are restarted. Sounds like someone enabled permissions previously, but they didn't take hold until you rebooted your cluster.

cheers,
-James

On Mon, Apr 25, 2011 at 1:19 AM, Peng, Wei wrote:
> I forgot to mention that hadoop was running fine before.
> However, after it crashed last week, the restarted hadoop cluster has
> such permission issues.
> So that means the settings are still the same as before.
> Then what would be the cause?
>
> Wei
>
> -Original Message-
> From: James Seigel [mailto:ja...@tynt.com]
> Sent: Sunday, April 24, 2011 5:36 AM
> To: common-user@hadoop.apache.org
> Subject: Re: HDFS permission denied
>
> Check where the hadoop tmp setting is pointing to.
>
> James
>
> Sent from my mobile. Please excuse the typos.
>
> On 2011-04-24, at 12:41 AM, "Peng, Wei" wrote:
>
> > Hi,
> >
> > I need help very badly.
> >
> > I got an HDFS permission error when starting to run a hadoop job:
> >
> > org.apache.hadoop.security.AccessControlException: Permission denied:
> > user=wp, access=WRITE, inode="":hadoop:supergroup:rwxr-xr-x
> >
> > I have the right permissions to read and write files to my own hadoop
> > user directory.
> > It works fine when I use hadoop fs -put. The job input and output are
> > all from my own hadoop user directory.
> >
> > It seems that when a job starts running, some data needs to be written
> > into some directory, and I don't have permission to that directory.
> > It is strange that the inode does not show which directory it is.
> >
> > Why does hadoop write something to a directory with my name secretly?
> > Do I need to be set to a particular user group?
> >
> > Many Thanks..
> >
> > Vivian
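For completeness, on a stock 0.20 install (assuming the bundled control scripts are in use) restarting the daemons so configuration changes take effect is just:

bin/stop-dfs.sh && bin/start-dfs.sh          # HDFS daemons: namenode, datanodes, secondarynamenode
bin/stop-mapred.sh && bin/start-mapred.sh    # jobtracker and tasktrackers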
Re: urgent, error: java.io.IOException: Cannot create directory
Hi Richard -

First thing that comes to mind is a permissions issue. Can you verify that your directories along the desired namenode path are writable by the appropriate user(s)?

HTH,
-James

On Wed, Dec 8, 2010 at 1:37 PM, Richard Zhang wrote:
> Hi Guys:
> I am just installing hadoop 0.21.0 on a single node cluster.
> I encounter the following error when I run bin/hadoop namenode -format
>
> 10/12/08 16:27:22 ERROR namenode.NameNode:
> java.io.IOException: Cannot create directory
> /your/path/to/hadoop/tmp/dir/hadoop-hadoop/dfs/name/current
>        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.clearDirectory(Storage.java:312)
>        at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1425)
>        at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1444)
>        at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1242)
>        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1348)
>        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1368)
>
> Below is my core-site.xml:
>
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/your/path/to/hadoop/tmp/dir/hadoop-${user.name}</value>
>   <description>A base for other temporary directories.</description>
> </property>
> <property>
>   <name>fs.default.name</name>
>   <value>hdfs://localhost:54310</value>
>   <description>The name of the default file system. A URI whose
>   scheme and authority determine the FileSystem implementation. The
>   uri's scheme determines the config property (fs.SCHEME.impl) naming
>   the FileSystem implementation class. The uri's authority is used to
>   determine the host, port, etc. for a filesystem.</description>
> </property>
>
> Below is my hdfs-site.xml:
>
> <property>
>   <name>dfs.replication</name>
>   <value>1</value>
>   <description>Default block replication.
>   The actual number of replications can be specified when the file is created.
>   The default is used if replication is not specified in create time.</description>
> </property>
>
> Below is my mapred-site.xml:
>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>localhost:54311</value>
>   <description>The host and port that the MapReduce job tracker runs
>   at. If "local", then jobs are run in-process as a single map
>   and reduce task.</description>
> </property>
>
> Thanks.
> Richard
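A quick way to check the permissions James mentions, assuming the namenode runs as a user named hadoop (which the hadoop-hadoop directory in the error suggests) and using the hadoop.tmp.dir path from the config above:

ls -ld /your/path/to/hadoop/tmp/dir
sudo mkdir -p /your/path/to/hadoop/tmp/dir
sudo chown -R hadoop:hadoop /your/path/to/hadoop/tmp/dir
bin/hadoop namenode -format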
Re: Multiple masters in hadoop
Actually the /hadoop/conf/masters file is for configuring secondarynamenode(s). Check http://www.cloudera.com/blog/2009/02/multi-host-secondarynamenode-configuration/ for details.

cheers,
-jw

On Wed, Sep 29, 2010 at 1:36 PM, Shi Yu wrote:
> The "Master" that appears in the Masters and Slaves files is the machine name or ip
> address. If you have a single cluster, specifying multiple names in
> those files will cause errors because of the connection failure.
>
> Shi
>
> On 2010-9-29 15:28, Bhushan Mahale wrote:
>> Hi,
>>
>> The master file in hadoop/conf is called masters.
>> Wondering if I can configure multiple masters for a single cluster. If
>> yes, how can I use them?
>>
>> Thanks,
>> Bhushan
>>
>> DISCLAIMER
>> ==========
>> This e-mail may contain privileged and confidential information which is
>> the property of Persistent Systems Ltd. It is intended only for the use of
>> the individual or entity to which it is addressed. If you are not the
>> intended recipient, you are not authorized to read, retain, copy, print,
>> distribute or use this message. If you have received this communication in
>> error, please notify the sender and delete all copies of this message.
>> Persistent Systems Ltd. does not accept any liability for virus infected
>> mails.
>
> --
> Postdoctoral Scholar
> Institute for Genomics and Systems Biology
> Department of Medicine, the University of Chicago
> Knapp Center for Biomedical Discovery
> 900 E. 57th St. Room 10148
> Chicago, IL 60637, US
> Tel: 773-702-6799
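In other words, on the namenode host conf/masters is just a list of the machine(s) that should run a secondarynamenode, one hostname per line, while conf/slaves lists the datanode/tasktracker machines. The hostnames below are placeholders:

snn01.example.com
snn02.example.com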
lengthy delay after the last reduce completes
I was just wondering what goes on under the covers once the last reduce task ends. The following is from a very simple map reduce I run throughout the day. Typically the run time is about a minute from start to end, but for this particular run there was a delay of over 5 minutes after the last reduce task ended. Any thoughts?

Thanks,
-James Warren

2010-05-07 01:11:10,302 [main] INFO org.apache.hadoop.mapred.JobClient - Running job: job_201005041742_0879
2010-05-07 01:11:11,305 [main] INFO org.apache.hadoop.mapred.JobClient - map 0% reduce 0%
2010-05-07 01:11:49,410 [main] INFO org.apache.hadoop.mapred.JobClient - map 4% reduce 0%
2010-05-07 01:11:55,427 [main] INFO org.apache.hadoop.mapred.JobClient - map 8% reduce 0%
2010-05-07 01:12:04,454 [main] INFO org.apache.hadoop.mapred.JobClient - map 17% reduce 0%
2010-05-07 01:12:07,462 [main] INFO org.apache.hadoop.mapred.JobClient - map 17% reduce 2%
2010-05-07 01:12:10,471 [main] INFO org.apache.hadoop.mapred.JobClient - map 26% reduce 2%
2010-05-07 01:12:16,487 [main] INFO org.apache.hadoop.mapred.JobClient - map 43% reduce 5%
2010-05-07 01:12:19,497 [main] INFO org.apache.hadoop.mapred.JobClient - map 100% reduce 5%
2010-05-07 01:12:22,505 [main] INFO org.apache.hadoop.mapred.JobClient - map 100% reduce 14%
2010-05-07 01:12:31,530 [main] INFO org.apache.hadoop.mapred.JobClient - map 100% reduce 100%
2010-05-07 01:18:06,367 [main] INFO org.apache.hadoop.mapred.JobClient - Job complete: job_201005041742_0879
Re: fair scheduler preemptions timeout difficulties
Todd from Cloudera solved this for me on their company's forum:

"What you're missing is the "mapred.fairscheduler.preemption" property in mapred-site.xml - without this on, the preemption settings in the allocations file are ignored... to turn it on, set that property's value to 'true'"

Thanks, Todd!

On Wed, Dec 2, 2009 at 4:26 PM, james warren wrote:
> Greetings, Hadoop Fans:
>
> I'm attempting to use the timeout feature of the Fair Scheduler (using
> Cloudera's most recently released distribution 0.20.1+152-1), but without
> success. I'm using the following configs:
>
> /etc/hadoop/conf/mapred-site.xml
>
> <property>
>   <name>mapred.job.tracker</name>
>   <value>hadoop-master:8021</value>
> </property>
> <property>
>   <name>mapred.tasktracker.map.tasks.maximum</name>
>   <value>9</value>
> </property>
> <property>
>   <name>mapred.tasktracker.reduce.tasks.maximum</name>
>   <value>3</value>
> </property>
> <property>
>   <name>mapred.jobtracker.taskScheduler</name>
>   <value>org.apache.hadoop.mapred.FairScheduler</value>
> </property>
> <property>
>   <name>mapred.fairscheduler.allocation.file</name>
>   <value>/etc/hadoop/conf/pools.xml</value>
> </property>
> <property>
>   <name>mapred.fairscheduler.assignmultiple</name>
>   <value>true</value>
> </property>
> <property>
>   <name>mapred.fairscheduler.poolnameproperty</name>
>   <value>pool.name</value>
> </property>
> <property>
>   <name>pool.name</name>
>   <value>default</value>
> </property>
>
> and /etc/hadoop/conf/pools.xml
>
> 4
> 1
> 180
> 2.0
>
> 2
> 2
> 1
>
> but a job in the realtime pool fails to interrupt a job running in the
> default queue (waited for > 15 minutes). Is there something wrong with my
> configs? Or is there anything in the logs that would be useful for
> debugging? (I've only found a "successfully configured fairscheduler"
> comment in the jobtracker log upon starting up the daemon.)
>
> Help would be extremely appreciated!
>
> Thanks,
> -James Warren
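The property Todd refers to, as a mapred-site.xml entry (restart the jobtracker after adding it):

<property>
  <name>mapred.fairscheduler.preemption</name>
  <value>true</value>
</property>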
fair scheduler preemptions timeout difficulties
Greetings, Hadoop Fans:

I'm attempting to use the timeout feature of the Fair Scheduler (using Cloudera's most recently released distribution 0.20.1+152-1), but without success. I'm using the following configs:

/etc/hadoop/conf/mapred-site.xml

<property>
  <name>mapred.job.tracker</name>
  <value>hadoop-master:8021</value>
</property>
<property>
  <name>mapred.tasktracker.map.tasks.maximum</name>
  <value>9</value>
</property>
<property>
  <name>mapred.tasktracker.reduce.tasks.maximum</name>
  <value>3</value>
</property>
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.FairScheduler</value>
</property>
<property>
  <name>mapred.fairscheduler.allocation.file</name>
  <value>/etc/hadoop/conf/pools.xml</value>
</property>
<property>
  <name>mapred.fairscheduler.assignmultiple</name>
  <value>true</value>
</property>
<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <value>pool.name</value>
</property>
<property>
  <name>pool.name</name>
  <value>default</value>
</property>

and /etc/hadoop/conf/pools.xml

4
1
180
2.0

2
2
1

but a job in the realtime pool fails to interrupt a job running in the default queue (waited for > 15 minutes). Is there something wrong with my configs? Or is there anything in the logs that would be useful for debugging? (I've only found a "successfully configured fairscheduler" comment in the jobtracker log upon starting up the daemon.)

Help would be extremely appreciated!

Thanks,
-James Warren
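For reference, a fair scheduler allocation file for the two pools described above generally takes the shape below. The element names and their pairing with the values listed above are assumptions for illustration, not a reconstruction of the original pools.xml:

<?xml version="1.0"?>
<allocations>
  <pool name="realtime">
    <minMaps>4</minMaps>
    <minReduces>1</minReduces>
    <minSharePreemptionTimeout>180</minSharePreemptionTimeout>
    <weight>2.0</weight>
  </pool>
  <pool name="default">
    <minMaps>2</minMaps>
    <minReduces>2</minReduces>
    <maxRunningJobs>1</maxRunningJobs>
  </pool>
</allocations>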
detecting stalled daemons?
Quick question for the hadoop / linux masters out there: I recently observed a stalled tasktracker daemon on our production cluster, and was wondering if there were common tests to detect failures so that administration tools (e.g. monit) can automatically restart the daemon.

The particular observed symptoms were:

- the node was dropped by the jobtracker
- information in /proc listed the tasktracker process as sleeping, not zombie
- the web interface (port 50060) was unresponsive, though telnet did connect
- no error information in the hadoop logs -- they simply were no longer being updated

I certainly cannot be the first person to encounter this - anyone have a neat and tidy solution they could share? (And yes, we will eventually go down the nagios / ganglia / cloudera desktop path, but we're waiting until we're running CDH2.)

Many thanks,
-James Warren
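One low-tech approach, offered only as a sketch: have monit or cron probe the web interface with a timeout and bounce the daemon when it stops answering. The paths and the restart mechanism below are assumptions about a typical 0.20-era install; the port is the one mentioned above.

#!/bin/sh
# Restart the tasktracker if its web UI (port 50060) does not answer within 10 seconds.
if ! curl -sf --max-time 10 http://localhost:50060/ > /dev/null; then
  /usr/lib/hadoop/bin/hadoop-daemon.sh stop tasktracker
  /usr/lib/hadoop/bin/hadoop-daemon.sh start tasktracker
fi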