Re: 3 machine cluster trouble

2012-05-23 Thread James Warren
Hi Pat -

The setting for hadoop.tmp.dir is used both locally and on HDFS and
therefore should be consistent across your cluster.

http://stackoverflow.com/questions/2354525/what-should-be-hadoop-tmp-dir
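
If the new node genuinely needs to keep its block storage on the other disk,
one option (a sketch I haven't tried on 0.20.205, so treat it as an assumption
to verify) is to leave hadoop.tmp.dir identical everywhere and instead point
dfs.data.dir at the larger volume in slave2's hdfs-site.xml:

<property>
  <name>dfs.data.dir</name>
  <!-- illustrative path on slave2's larger disk -->
  <value>/media/d2/app/hadoop/dfs/data</value>
</property>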

cheers,
-James

On Wed, May 23, 2012 at 3:44 PM, Pat Ferrel  wrote:

> I have a two machine cluster and am adding a new machine. The new node has
> a different location for hadoop.tmp.dir than the other two nodes and
> refuses to start the datanode when started in the cluster. When I change
> the location pointed to by hadoop.tmp.dir to be the same on all machines it
> starts up fine on all machines.
>
> Shouldn't I be able to have the master and slave1 set as:
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/app/hadoop/tmp</value>
>   <description>A base for other temporary directories.</description>
> </property>
>
> And slave2 set as:
> <property>
>   <name>hadoop.tmp.dir</name>
>   <value>/media/d2/app/hadoop/tmp</value>
>   <description>A base for other temporary directories.</description>
> </property>
>
> ??? Slave2 runs standalone in single node mode just fine. Using 0.20.205.
>


Re: Balancer exiting immediately despite having work to do.

2012-01-04 Thread James Warren
Hi Landy -

Attachments are stripped from e-mails sent to the mailing list.  Could you
publish your logs on pastebin and forward the url?

cheers,
-James

On Wed, Jan 4, 2012 at 10:03 AM, Bible, Landy wrote:

> Hi all,
>
> I’m running Hadoop 0.20.2.  The balancer has suddenly stopped working.
> I’m attempting to balance the cluster with a threshold of 1, using the
> following command:
>
> ./hadoop balancer -threshold 1
>
> This has been working fine, but suddenly it isn't.  It skips through 5
> iterations without actually doing any work:
>
> Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
> Jan 4, 2012 11:47:56 AM  0           0 KB                 1.87 GB             6.68 GB
> Jan 4, 2012 11:47:56 AM  1           0 KB                 1.87 GB             6.68 GB
> Jan 4, 2012 11:47:56 AM  2           0 KB                 1.87 GB             6.68 GB
> Jan 4, 2012 11:47:57 AM  3           0 KB                 1.87 GB             6.68 GB
> Jan 4, 2012 11:47:57 AM  4           0 KB                 1.87 GB             6.68 GB
>
> No block has been moved for 5 iterations. Exiting...
>
> Balancing took 524.0 milliseconds
>
> I’ve attached the full log, but I can’t see any errors indicating why it
> is failing.  Any ideas?  I’d really like to get balancing working again.
> My use case isn’t the norm, and it is important that the cluster stay as
> close to completely balanced as possible.
>
> --
>
> Landy Bible
>
> Simulation and Computer Specialist
>
> School of Nursing – Collins College of Business
>
> The University of Tulsa
>
>


Re: Map Task Capacity Not Changing

2011-12-15 Thread James Warren
(moving to mapreduce-user@, bcc'ing common-user@)

Hi Joey -

You'll want to change the value on all of your servers running tasktrackers
and then restart each tasktracker to reread the configuration.
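
For example (paths assume a stock 0.20 install - adjust to your layout), after
adding mapred.tasktracker.map.tasks.maximum to each node's mapred-site.xml:

bin/hadoop-daemon.sh stop tasktracker
bin/hadoop-daemon.sh start tasktracker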

cheers,
-James

On Thu, Dec 15, 2011 at 3:30 PM, Joey Krabacher wrote:

> I have looked up how to increase this value on the web and have tried all
> the suggestions, to no avail.
>
> Any help would be great.
>
> Here is some background:
>
> Version: 0.20.2, r911707
> Compiled: Fri Feb 19 08:07:34 UTC 2010 by chrisdo
>
> Nodes: 5
> Current Map Task Capacity : 10  <--- this is what I want to increase.
>
> What I have tried :
>
> Adding
>   <property>
>     <name>mapred.tasktracker.map.tasks.maximum</name>
>     <value>8</value>
>     <final>true</final>
>   </property>
> to mapred-site.xml on the NameNode.  I also added this to one of the
> datanodes for the hell of it and that didn't work either.
>
> Thanks.
>


Re: Regarding pointers for LZO compression in Hive and Hadoop

2011-12-14 Thread James Warren
Hi Abhishek -

(Redirecting to user@hive, bcc'ing common-user)

I found this blog to be particularly useful when incorporating Hive and LZO:

http://www.mrbalky.com/2011/02/24/hive-tables-partitions-and-lzo-compression/

And if you're having issues setting up LZO with Hadoop in general, check out

https://github.com/toddlipcon/hadoop-lzo
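
The short version of the first link, from memory (so double-check the class
names against the post), is to create the table with the LZO text input format
once the codec is installed:

CREATE TABLE my_lzo_table (...)
STORED AS
  INPUTFORMAT "com.hadoop.mapred.DeprecatedLzoTextInputFormat"
  OUTPUTFORMAT "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat";

(the table name and columns are placeholders) and then load your .lzo files
into its partitions as usual.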

cheers,
-James



On Wed, Dec 14, 2011 at 11:32 AM, Abhishek Pratap Singh  wrote:

> Hi,
>
> I'm looking for some useful docs on enabling LZO on a hadoop cluster. I tried
> a few of the blogs, but somehow it's not working.
> Here is my requirement.
>
> I have hadoop 0.20.2 and Hive 0.6. I have some tables with 1.5 TB of
> data; I want to compress them using LZO and enable LZO in Hive as well as
> in hadoop.
> Let me know if you have any useful docs or pointers for the same.
>
>
> Regards,
> Abhishek
>


Re: HDFS permission denied

2011-04-25 Thread James Warren
At this point you should follow Mathias' advice - go to the logs and
determine which path has the permission issue.  It's better to change the
permissions on that specific path than to disable permissions (i.e. make
everything 777) indiscriminately.
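
Once the offending path shows up in the namenode log, something along these
lines is usually enough (the user and path here are only placeholders):

hadoop fs -chown wp /path/from/the/log
hadoop fs -chmod -R 755 /path/from/the/log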

-jw

On Mon, Apr 25, 2011 at 10:04 AM, Peng, Wei  wrote:

> James,
>
> Thanks for your replies.
> In this case, how can I set up the permissions correctly in order to run
> a hadoop job?
> Do I need to set the hadoop tmp directory (which is a local directory
> rather than an HDFS directory, right?) to 777?
> Since the person who maintained the hadoop cluster has left, I have no
> idea what happened. =(
>
> Wei
>
> -Original Message-
> From: jameswarr...@gmail.com [mailto:jameswarr...@gmail.com] On Behalf
> Of James Warren
> Sent: Monday, April 25, 2011 9:56 AM
> To: common-user@hadoop.apache.org
> Subject: Re: HDFS permission denied
>
> Hi Wei -
>
> In general, settings changes aren't applied until the hadoop daemons are
> restarted.  Sounds like someone enabled permissions previously, but they
> didn't take hold until you rebooted your cluster.
>
> cheers,
> -James
>
> On Mon, Apr 25, 2011 at 1:19 AM, Peng, Wei  wrote:
>
> > I forgot to mention that hadoop was running fine before.
> > However, after it crashed last week, the restarted hadoop cluster has
> > these permission issues.
> > That means the settings are still the same as before.
> > Then what would be the cause?
> >
> > Wei
> >
> > -Original Message-
> > From: James Seigel [mailto:ja...@tynt.com]
> > Sent: Sunday, April 24, 2011 5:36 AM
> > To: common-user@hadoop.apache.org
> > Subject: Re: HDFS permission denied
> >
> > Check where the hadoop tmp setting is pointing to.
> >
> > James
> >
> > Sent from my mobile. Please excuse the typos.
> >
> > On 2011-04-24, at 12:41 AM, "Peng, Wei"  wrote:
> >
> > > Hi,
> > >
> > >
> > >
> > > I need help very badly.
> > >
> > >
> > >
> > > I got an HDFS permission error when starting to run a hadoop job:
> > >
> > > org.apache.hadoop.security.AccessControlException: Permission denied:
> > > user=wp, access=WRITE, inode="":hadoop:supergroup:rwxr-xr-x
> > >
> > >
> > >
> > > I have the right permission to read and write files to my own hadoop
> > > user directory.
> > >
> > > It works fine when I use hadoop fs -put. The job input and output are
> > > all from my own hadoop user directory.
> > >
> > >
> > >
> > > It seems that when a job starts running, some data needs to be written
> > > into some directory, and I don't have permission for that directory.
> > > It is strange that the inode does not show which directory it is.
> > >
> > > Why does hadoop write something to a directory with my name secretly? Do
> > > I need to be set to a particular user group?
> > >
> > >
> > >
> > > Many Thanks..
> > >
> > >
> > >
> > > Vivian
> > >
> > >
> > >
> > >
> > >
> >
>


Re: HDFS permission denied

2011-04-25 Thread James Warren
Hi Wei -

In general, settings changes aren't applied until the hadoop daemons are
restarted.  Sounds like someone enabled permissions previously, but they
didn't take hold until you rebooted your cluster.
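
(With the stock scripts that usually means bin/stop-all.sh followed by
bin/start-all.sh on the master, or bin/hadoop-daemon.sh stop/start for an
individual daemon - adjust for however your cluster is managed.)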

cheers,
-James

On Mon, Apr 25, 2011 at 1:19 AM, Peng, Wei  wrote:

> I forgot to mention that hadoop was running fine before.
> However, after it crashed last week, the restarted hadoop cluster has
> these permission issues.
> That means the settings are still the same as before.
> Then what would be the cause?
>
> Wei
>
> -Original Message-
> From: James Seigel [mailto:ja...@tynt.com]
> Sent: Sunday, April 24, 2011 5:36 AM
> To: common-user@hadoop.apache.org
> Subject: Re: HDFS permission denied
>
> Check where the hadoop tmp setting is pointing to.
>
> James
>
> Sent from my mobile. Please excuse the typos.
>
> On 2011-04-24, at 12:41 AM, "Peng, Wei"  wrote:
>
> > Hi,
> >
> >
> >
> > I need help very badly.
> >
> >
> >
> > I got an HDFS permission error when starting to run a hadoop job:
> >
> > org.apache.hadoop.security.AccessControlException: Permission denied:
> >
> > user=wp, access=WRITE, inode="":hadoop:supergroup:rwxr-xr-x
> >
> >
> >
> > I have the right permission to read and write files to my own hadoop
> > user directory.
> >
> > It works fine when I use hadoop fs -put. The job input and output are
> > all from my own hadoop user directory.
> >
> >
> >
> > It seems that when a job starts running, some data needs to be written
> > into some directory, and I don't have permission for that directory.
> > It is strange that the inode does not show which directory it is.
> >
> > Why does hadoop write something to a directory with my name secretly? Do
> > I need to be set to a particular user group?
> >
> >
> >
> > Many Thanks..
> >
> >
> >
> > Vivian
> >
> >
> >
> >
> >
>


Re: urgent, error: java.io.IOException: Cannot create directory

2010-12-08 Thread james warren
Hi Richard -

First thing that comes to mind is a permissions issue.  Can you verify that
your directories along the desired namenode path are writable by the
appropriate user(s)?
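
For example (the user/group and directory are just placeholders for whatever
your setup uses):

ls -ld /your/path/to/hadoop/tmp/dir
sudo chown -R hadoop:hadoop /your/path/to/hadoop/tmp/dir

and then re-run bin/hadoop namenode -format as that user.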

HTH,
-James

On Wed, Dec 8, 2010 at 1:37 PM, Richard Zhang wrote:

> Hi Guys:
> I am just installing hadoop 0.21.0 on a single-node cluster.
> I encountered the following error when I ran bin/hadoop namenode -format:
>
> 10/12/08 16:27:22 ERROR namenode.NameNode:
> java.io.IOException: Cannot create directory
> /your/path/to/hadoop/tmp/dir/hadoop-hadoop/dfs/name/current
>    at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.clearDirectory(Storage.java:312)
>    at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1425)
>    at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:1444)
>    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:1242)
>    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1348)
>    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1368)
>
>
> Below is my core-site.xml
>
> <configuration>
>
> <property>
>  <name>hadoop.tmp.dir</name>
>  <value>/your/path/to/hadoop/tmp/dir/hadoop-${user.name}</value>
>  <description>A base for other temporary directories.</description>
> </property>
>
> <property>
>  <name>fs.default.name</name>
>  <value>hdfs://localhost:54310</value>
>  <description>The name of the default file system.  A URI whose
>  scheme and authority determine the FileSystem implementation.  The
>  uri's scheme determines the config property (fs.SCHEME.impl) naming
>  the FileSystem implementation class.  The uri's authority is used to
>  determine the host, port, etc. for a filesystem.
>  </description>
> </property>
>
> </configuration>
>
>
> Below is my hdfs-site.xml
> <configuration>
>
> <property>
>  <name>dfs.replication</name>
>  <value>1</value>
>  <description>Default block replication.
>  The actual number of replications can be specified when the file is
>  created.
>  The default is used if replication is not specified in create time.
>  </description>
> </property>
>
> </configuration>
>
>
> below is my mapred-site.xml:
> <configuration>
>
> <property>
>  <name>mapred.job.tracker</name>
>  <value>localhost:54311</value>
>  <description>The host and port that the MapReduce job tracker runs
>  at.  If "local", then jobs are run in-process as a single map
>  and reduce task.
>  </description>
> </property>
>
> </configuration>
>
>
> Thanks.
> Richard
>


Re: Multiple masters in hadoop

2010-09-29 Thread james warren
Actually the /hadoop/conf/masters file is for configuring
secondarynamenode(s).  Check
http://www.cloudera.com/blog/2009/02/multi-host-secondarynamenode-configuration/
for
details.
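
In other words, a conf/masters containing just a hostname (the name below is
only an example)

snn-host.example.com

controls where start-dfs.sh launches the secondary namenode; the namenode and
jobtracker locations still come from fs.default.name and mapred.job.tracker in
the *-site.xml files.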

cheers,
-jw

On Wed, Sep 29, 2010 at 1:36 PM, Shi Yu  wrote:

> The "Master" appeared in Masters and Salves files is the machine name or ip
> address.  If you have a single cluster, when you specify multiple names in
> those files it will cause error because of the connection failure.
>
> Shi
>
>
> On 2010-9-29 15:28, Bhushan Mahale wrote:
>
>> Hi,
>>
>> The master file in hadoop/conf is called masters.
>> I'm wondering if I can configure multiple masters for a single cluster. If
>> yes, how can I use them?
>>
>> Thanks,
>> Bhushan
>>
>>
>
>
> --
> Postdoctoral Scholar
> Institute for Genomics and Systems Biology
> Department of Medicine, the University of Chicago
> Knapp Center for Biomedical Discovery
> 900 E. 57th St. Room 10148
> Chicago, IL 60637, US
> Tel: 773-702-6799
>
>


lengthy delay after the last reduce completes

2010-05-07 Thread james warren
I was just wondering what goes on under the covers once the last reduce task
ends.  The following is from a very simple map reduce I run throughout the
day.  Typically the run time is about a minute from start to end, but for
this particular run there was a delay of over 5 minutes after the last
reduce task ended.

Any thoughts?

Thanks,
-James Warren



2010-05-07 01:11:10,302 [main] INFO  org.apache.hadoop.mapred.JobClient  -
Running job: job_201005041742_0879
2010-05-07 01:11:11,305 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 0% reduce 0%
2010-05-07 01:11:49,410 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 4% reduce 0%
2010-05-07 01:11:55,427 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 8% reduce 0%
2010-05-07 01:12:04,454 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 17% reduce 0%
2010-05-07 01:12:07,462 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 17% reduce 2%
2010-05-07 01:12:10,471 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 26% reduce 2%
2010-05-07 01:12:16,487 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 43% reduce 5%
2010-05-07 01:12:19,497 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 100% reduce 5%
2010-05-07 01:12:22,505 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 100% reduce 14%
2010-05-07 01:12:31,530 [main] INFO  org.apache.hadoop.mapred.JobClient  -
 map 100% reduce 100%
2010-05-07 01:18:06,367 [main] INFO  org.apache.hadoop.mapred.JobClient  -
Job complete: job_201005041742_0879


Re: fair scheduler preemptions timeout difficulties

2009-12-02 Thread james warren
Todd from Cloudera solved this for me on their company's forum.

"What you're missing is the "mapred.fairscheduler.preemption" property in
mapred-site.xml - without this on, the preemption settings in the
allocations file are ignored... to turn it on, set that property's value to
'true'"
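
In other words, something along these lines in mapred-site.xml, followed by a
jobtracker restart to pick it up:

<property>
  <name>mapred.fairscheduler.preemption</name>
  <value>true</value>
</property>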

Thanks, Todd!

On Wed, Dec 2, 2009 at 4:26 PM, james warren  wrote:

> Greetings, Hadoop Fans:
>
> I'm attempting to use the timeout feature of the Fair Scheduler (using
> Cloudera's most recently released distribution 0.20.1+152-1), but without
> success.  I'm using the following configs:
>
> /etc/hadoop/conf/mapred-site.xml
>
> <?xml version="1.0"?>
>
> <configuration>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>hadoop-master:8021</value>
>   </property>
>   <property>
>     <name>mapred.tasktracker.map.tasks.maximum</name>
>     <value>9</value>
>   </property>
>   <property>
>     <name>mapred.tasktracker.reduce.tasks.maximum</name>
>     <value>3</value>
>   </property>
>   <property>
>     <name>mapred.jobtracker.taskScheduler</name>
>     <value>org.apache.hadoop.mapred.FairScheduler</value>
>   </property>
>   <property>
>     <name>mapred.fairscheduler.allocation.file</name>
>     <value>/etc/hadoop/conf/pools.xml</value>
>   </property>
>   <property>
>     <name>mapred.fairscheduler.assignmultiple</name>
>     <value>true</value>
>   </property>
>   <property>
>     <name>mapred.fairscheduler.poolnameproperty</name>
>     <value>pool.name</value>
>   </property>
>   <property>
>     <name>pool.name</name>
>     <value>default</value>
>   </property>
>
> </configuration>
>
> and /etc/hadoop/conf/pools.xml
>
> 
> 
>   
> 4
> 1
> 180
> 2.0
>   
>   
> 2
> 2
> 1
>   
> 
>
> but a job in the realtime pool fails to interrupt a job running in the
> default queue (waited for > 15 minutes).  Is there something wrong with my
> configs?  Or is there anything in the logs that would be useful for
> debugging?  (I've only found a "successfully configured fairscheduler"
> comment in the jobtracker log upon starting up the daemon.)
>
> Help would be extremely appreciated!
>
> Thanks,
> -James Warren
>
>


fair scheduler preemptions timeout difficulties

2009-12-02 Thread james warren
Greetings, Hadoop Fans:

I'm attempting to use the timeout feature of the Fair Scheduler (using
Cloudera's most recently released distribution 0.20.1+152-1), but without
success.  I'm using the following configs:

/etc/hadoop/conf/mapred-site.xml

<?xml version="1.0"?>

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hadoop-master:8021</value>
  </property>
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>9</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>3</value>
  </property>
  <property>
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.FairScheduler</value>
  </property>
  <property>
    <name>mapred.fairscheduler.allocation.file</name>
    <value>/etc/hadoop/conf/pools.xml</value>
  </property>
  <property>
    <name>mapred.fairscheduler.assignmultiple</name>
    <value>true</value>
  </property>
  <property>
    <name>mapred.fairscheduler.poolnameproperty</name>
    <value>pool.name</value>
  </property>
  <property>
    <name>pool.name</name>
    <value>default</value>
  </property>

</configuration>

and /etc/hadoop/conf/pools.xml



  
4
1
180
2.0
  
  
2
2
1
  


but a job in the realtime pool fails to interrupt a job running in the
default queue (waited for > 15 minutes).  Is there something wrong with my
configs?  Or is there anything in the logs that would be useful for
debugging?  (I've only found a "successfully configured fairscheduler"
comment in the jobtracker log upon starting up the daemon.)

Help would be extremely appreciated!

Thanks,
-James Warren


detecting stalled daemons?

2009-10-07 Thread james warren
Quick question for the hadoop / linux masters out there:

I recently observed a stalled tasktracker daemon on our production cluster,
and was wondering if there were common tests to detect failures so that
administration tools (e.g. monit) can automatically restart the daemon.  The
particular observed symptoms were:

   - the node was dropped by the jobtracker
   - information in /proc listed the tasktracker process as sleeping, not
   zombie
   - the web interface (port 50060) was unresponsive, though telnet did
   connect
   - no error information in the hadoop logs -- they simply were no longer
   being updated

I certainly cannot be the first person to encounter this - anyone have a
neat and tidy solution they could share?
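
For context, the kind of check I have in mind is a rough cron sketch like the
following (the install path and port assume a default 0.20 tasktracker):

#!/bin/sh
# restart the tasktracker if its web UI stops answering
if ! curl -sf --max-time 15 http://localhost:50060/ > /dev/null; then
  /usr/lib/hadoop/bin/hadoop-daemon.sh stop tasktracker
  /usr/lib/hadoop/bin/hadoop-daemon.sh start tasktracker
fi

but I'm hoping there's something less ad hoc.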

(And yes, we will eventually we go down the nagios / ganglia / cloudera
desktop path but we're waiting until we're running CDH2.)

Many thanks,
-James Warren