Protecting NN JT UI with password

2011-10-03 Thread Shahnawaz Saifi
Hi, I would like to know how to protect the Hadoop web UIs running on ports 50030 and 50070 with a password, including the HBase master UI on port 60010. -- Thanks, Shah
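One era-appropriate approach (an assumption on my part, not something from the thread) is a standard servlet filter enforcing HTTP Basic Auth, wired into Hadoop's embedded web server through the hadoop.http.filter.initializers property where the running release supports it; a reverse proxy with authentication in front of those ports is the other common route. A minimal sketch of such a filter, assuming the servlet API and commons-codec (already on Hadoop's classpath); the class name and credentials here are made up:

```java
import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.commons.codec.binary.Base64;

// Hypothetical HTTP Basic Auth filter for the NN/JT/HMaster web UIs.
// Credentials are hard-coded for brevity; read them from configuration in practice.
public class BasicAuthFilter implements Filter {
  private static final String EXPECTED =
      "Basic " + new String(Base64.encodeBase64("admin:secret".getBytes()));

  public void init(FilterConfig config) {}

  public void destroy() {}

  public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
      throws IOException, ServletException {
    String auth = ((HttpServletRequest) req).getHeader("Authorization");
    if (EXPECTED.equals(auth)) {
      chain.doFilter(req, res); // credentials match; let the request through
    } else {
      HttpServletResponse httpRes = (HttpServletResponse) res;
      httpRes.setHeader("WWW-Authenticate", "Basic realm=\"hadoop\"");
      httpRes.sendError(HttpServletResponse.SC_UNAUTHORIZED); // 401 challenge
    }
  }
}
```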

Re: incremental loads into hadoop

2011-10-03 Thread Mohit Anchlia
This process of managing the loads looks like more pain long term. Would it be easier to store the data in HBase, which has a smaller block size? What's the avg. file size? On Sun, Oct 2, 2011 at 7:34 PM, Vitthal Suhas Gogate gog...@hortonworks.com wrote: Agree with Bejoy, although to minimize the processing latency

Fw: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Raj V
Sending it to the Hadoop mailing list - I think this is a Hadoop-related problem and not related to the Cloudera distribution. Raj - Forwarded Message - From: Raj V rajv...@yahoo.com To: CDH Users cdh-u...@cloudera.org Sent: Friday, September 30, 2011 5:21 PM Subject: pointing

Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Eric Caspole
Are you sure you have chown'd/chmod'd the ramdisk directory to be writeable by your hadoop user? I have played with this in the past and it should basically work. On Oct 3, 2011, at 10:37 AM, Raj V wrote: Sending it to the hadoop mailing list - I think this is a hadoop related problem

Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Raj V
Eric, yes. The owner is hdfs and the group is hadoop, and the directory is group writable (775). This is the exact same configuration I have when I use real disks. But let me give it a try again to see if I overlooked something. Thanks Raj From: Eric Caspole

Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Vinod Kumar Vavilapalli
Must be related to some kind of permissions problem. It will help if you can paste the corresponding source code for FileUtil.copy(); it is hard to track with different versions otherwise. Thanks, +Vinod On Mon, Oct 3, 2011 at 9:28 PM, Raj V rajv...@yahoo.com wrote: Eric Yes. The owner is hdfs and
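To rule permissions in or out quickly, a small probe along these lines (not from the thread; mapred.local.dir is the 0.20-era property name) can be run as the user that runs the TaskTracker:

```java
import java.io.File;
import java.io.IOException;
import org.apache.hadoop.mapred.JobConf;

// Diagnostic sketch: try to create a file in every configured
// mapred.local.dir entry to check writability on the ramdisk mount.
public class LocalDirProbe {
  public static void main(String[] args) {
    JobConf conf = new JobConf(); // loads mapred-site.xml from the classpath
    for (String dir : conf.getStrings("mapred.local.dir", "/tmp/mapred/local")) {
      try {
        File probe = File.createTempFile("probe-", ".tmp", new File(dir));
        probe.delete();
        System.out.println("writable: " + dir);
      } catch (IOException e) {
        System.out.println("NOT writable: " + dir + " (" + e.getMessage() + ")");
      }
    }
  }
}
```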

Re: Help - can't start namenode after disk full error

2011-10-03 Thread Shouguo Li
Hi Ryan, I'm trying to recover from a disk-full error on the namenode as well. I can fire up the namenode after printf '\xff\xff\xff\xee\xff' > /var/name/current/edits, but now it's stuck in safe mode verifying blocks for hours... Is there a way to check progress on that? Or is there a way to speed that
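As far as I know there is no percentage indicator for this, but the namenode web UI on port 50070 shows the safe-mode message, including the ratio of blocks reported so far, and hadoop dfsadmin -safemode get tells you whether safe mode is still on. The same query is available from the client API; a minimal sketch, assuming a 0.20-era client where SafeModeAction still lives in FSConstants:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.FSConstants;

public class SafeModeStatus {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration(); // reads core-site.xml/hdfs-site.xml
    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
    // SAFEMODE_GET only queries the state; it does not change it
    boolean inSafeMode = dfs.setSafeMode(FSConstants.SafeModeAction.SAFEMODE_GET);
    System.out.println("namenode in safe mode: " + inSafeMode);
  }
}
```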

Re: incremental loads into hadoop

2011-10-03 Thread in.abdul
There are two methods for processing OLTP data: 1. HStreaming or Scribe; these are the only methods. 2. Otherwise, use Chukwa for storing the data, so that once you have a decent volume you can move it to HDFS. Thanks and Regards, S SYED ABDUL KATHER

lazy-loading of Reduce's input

2011-10-03 Thread Sami Dalouche
Hi, My understanding is that when the reduce() method is called, the values (Iterable<VALUEIN> values) are stored in memory. 1/ Is that actually true? 2/ If this is true, is there a way to lazy-load the inputs to use less memory? (e.g. load all the items in batches of 20, and discard the
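For what it's worth, the framework does not materialize the values as an in-memory list: the Iterable is backed by an iterator that deserializes one value at a time from the sorted, merged map output, so memory stays bounded no matter how many values a key has (each individual value still has to fit in memory, and the iterator can only be traversed once). A minimal sketch of that single-pass contract:

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
  @Override
  protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    long sum = 0;
    // Single forward pass: each next() streams one value from disk;
    // the iterator cannot be reset or traversed a second time.
    for (IntWritable v : values) {
      sum += v.get();
    }
    context.write(key, new IntWritable((int) sum));
  }
}
```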

Re: incremental loads into hadoop

2011-10-03 Thread Sam Seigal
I have given HBase a fair amount of thought, and I am looking for input. Instead of managing incremental loads myself, why not just set up an HBase cluster? What are some of the trade-offs? My primary use for this cluster would still be data analysis/aggregation and not so much random access.

Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Edward Capriolo
This directory can get very large; in many cases I doubt it would fit on a RAM disk. Also, RAM disks tend to help most with random read/write, and since Hadoop does mostly linear IO you may not see a great benefit from the RAM disk. On Mon, Oct 3, 2011 at 12:07 PM, Vinod Kumar Vavilapalli

Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Raj V
Vinod, I carefully checked everything again. The permissions are 775 and the owner is hdfs:hadoop. The task tracker creates a directory called toBeDeleted under /ramdisk, so things do not seem to be permission related. The task tracker starts happily if I don't mount the ramdisk and leave

Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Raj V
Edward, I understand the size limitations - but for my experiment the ramdisk I have created is large enough. I think there will be substantial benefits from putting the intermediate map outputs on a ramdisk - size permitting, of course - but I can't provide any numbers to substantiate my

Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Joey Echeverria
Raj, I just tried this on my CDH3u1 VM, and the ramdisk worked the first time. So it's possible you've hit a bug in CDH3b3 that was later fixed. Can you enable debug logging in log4j.properties and then repost your task tracker log? I think there might be more details that it will print that

Re: lazy-loading of Reduce's input

2011-10-03 Thread Sami Dalouche
Just to make sure I was clear enough: is there a parameter that allows setting the size of the batch of elements that are retrieved into memory while the reduce task iterates over the input values? Thanks, Sami Dalouche On Mon, Oct 3, 2011 at 1:42 PM, Sami Dalouche sa...@hopper.com wrote: Hi,

Re: pointing mapred.local.dir to a ramdisk

2011-10-03 Thread Raj V
Joey, thanks. I will try to upgrade to a newer version and check. I will also change the logs to debug and see if more information is available. Raj From: Joey Echeverria j...@cloudera.com To: common-user@hadoop.apache.org; Raj V rajv...@yahoo.com Sent:

Monitoring Slow job.

2011-10-03 Thread patrick sang
Hi Hadoopers, I am writing a script to detect whether any running job has been running longer than X hours. So far, I use ./hadoop job -jt jobtracker:port -list all | awk '{ if ($2 == 1) print $1 }' to get the list of running job IDs. Now I am trying to find a way to get how long a job has been
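If shelling out to hadoop job -list gets awkward, the same information is exposed programmatically through the old-API JobClient; a sketch along these lines (not from the thread, threshold passed in hours as args[0]) prints jobs that exceed it:

```java
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobStatus;

// Usage: SlowJobDetector <hours>
public class SlowJobDetector {
  public static void main(String[] args) throws Exception {
    long thresholdMs = Long.parseLong(args[0]) * 3600L * 1000L;
    JobClient client = new JobClient(new JobConf()); // jobtracker from mapred-site.xml
    for (JobStatus status : client.getAllJobs()) {
      long elapsed = System.currentTimeMillis() - status.getStartTime();
      if (status.getRunState() == JobStatus.RUNNING && elapsed > thresholdMs) {
        System.out.println(status.getJobID() + " has been running for "
            + (elapsed / 60000) + " minutes");
      }
    }
  }
}
```

The $2 == 1 test in the awk filter above matches the same state constant, JobStatus.RUNNING.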

Re: error for deploying hadoop on macbook pro

2011-10-03 Thread Jignesh Patel
Thanks Harsh, it worked out; now I am able to start the cluster. But when I tried to see the job conf history at http://localhost:50030/jobconf_history.jsp, I got the following message: Missing 'logFile' for fetching job configuration! On Sep 30, 2011, at 5:41 PM, Harsh J wrote: Since you're only

Re: Monitoring Slow job.

2011-10-03 Thread Vitthal Suhas Gogate
I am not sure there is an easy way to get what you want on the command line. One option is to use the following command, which gives you a verbose job history where you can find the submit, launch, and finish times (including the duration on the FinishTime line). I am using the hadoop-0.20.205.0 branch, so check if you

Re: error for deploying hadoop on macbook pro

2011-10-03 Thread Vitthal Suhas Gogate
The steps in the following document worked for me, except: -- JAVA_HOME needs to be set correctly in conf/hadoop-env.sh. -- By default on Mac OS X, sshd is not running, so you need to start it via System Preferences/Sharing and add the users who are allowed to ssh.

Re: error for deploying hadoop on macbook pro

2011-10-03 Thread Vitthal Suhas Gogate
Sorry, a few more things: -- localhost did not work for me; I had to use my machine name as returned by hostname, e.g. horton-mac.local. -- Also change localhost to your machine name in the conf/slaves and conf/masters files. --Suhas On Mon, Oct 3, 2011 at 5:44 PM, Vitthal Suhas Gogate

Adjusting column value size.

2011-10-03 Thread edward choi
Hi, I have a question regarding performance and column value size. I need to store several million integers per row (several million is the important part here). I was wondering which method would be more beneficial performance-wise. 1) Store each integer in a single column, so that when a row is
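The message is truncated before the alternative, but the first option in the 0.90-era client API would look roughly like this; the table and family names are made up for illustration:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch of option 1: one integer per column qualifier under a single row key.
public class WideRowWriter {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "metrics"); // hypothetical table name
    Put put = new Put(Bytes.toBytes("row-1"));
    for (int i = 0; i < 1000; i++) { // millions of columns in the real case
      put.add(Bytes.toBytes("f"), Bytes.toBytes(i), Bytes.toBytes(i * 7));
    }
    table.put(put);
    table.close();
  }
}
```

One trade-off worth keeping in mind: a single row is never split across regions, so rows with millions of columns can produce very large, hard-to-balance regions.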