Hi,
I would like to know how to password-protect the Hadoop Web UIs running on
ports 50030 and 50070, as well as the HBase master UI on port 60010.
--
Thanks,
Shah
This process of managing them looks like more pain long term. Would it be
easier to store the files in HBase, which uses a smaller block size?
What's the average file size?
On Sun, Oct 2, 2011 at 7:34 PM, Vitthal Suhas Gogate
gog...@hortonworks.com wrote:
Agree with Bejoy, although to minimize the processing latency
Sending it to the Hadoop mailing list - I think this is a Hadoop-related
problem and not related to the Cloudera distribution.
Raj
- Forwarded Message -
From: Raj V rajv...@yahoo.com
To: CDH Users cdh-u...@cloudera.org
Sent: Friday, September 30, 2011 5:21 PM
Subject: pointing
Are you sure you have chown'd/chmod'd the ramdisk directory to be
writeable by your hadoop user? I have played with this in the past
and it should basically work.
On Oct 3, 2011, at 10:37 AM, Raj V wrote:
Sending it to the Hadoop mailing list - I think this is a Hadoop-related
problem
Eric
Yes. The owner is hdfs and group is hadoop, and the directory is group
writable (775). This is the exact same configuration I have when I use real
disks. But let me give it a try again to see if I overlooked something.
Thanks
Raj
From: Eric Caspole
It must be related to some kind of permissions problem.
It will help if you can paste the corresponding source code for
FileUtil.copy(). It is hard to track down with the different versions otherwise.
Thanks,
+Vinod
On Mon, Oct 3, 2011 at 9:28 PM, Raj V rajv...@yahoo.com wrote:
Eric
Yes. The owner is hdfs and
Hi Ryan,
I'm trying to recover from a disk-full error on the namenode as well. I can fire
up the namenode after printf "\xff\xff\xff\xee\xff" > /var/name/current/edits,
but now it's stuck in safe mode verifying blocks for hours... Is there a way to
check progress on that?
Or is there a way to speed that up?
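Safe mode status can be polled with hadoop dfsadmin -safemode get (or -safemode
wait to block until it exits). The same check is possible programmatically; a
minimal sketch against the 0.20-era HDFS API, assuming FSConstants.SafeModeAction
exists under that name in your version:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.FSConstants;

public class SafeModeCheck {
    public static void main(String[] args) throws Exception {
        // Picks up fs.default.name from core-site.xml on the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        if (fs instanceof DistributedFileSystem) {
            DistributedFileSystem dfs = (DistributedFileSystem) fs;
            // SAFEMODE_GET only queries the current state; it does not change it.
            boolean inSafeMode = dfs.setSafeMode(FSConstants.SafeModeAction.SAFEMODE_GET);
            System.out.println("NameNode in safe mode: " + inSafeMode);
        }
    }
}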
There are two methods for processing OLTP data:
1. HStreaming or Scribe; these are the main methods.
2. Otherwise, use Chukwa for storing the data, so that once you have a
decent volume you can move it to HDFS.
Thanks and Regards,
S SYED ABDUL KATHER
Hi,
My understanding is that when the reduce() method is called, the values
(Iterable<VALUEIN> values) are stored in memory.
1/ Is that actually true?
2/ If this is true, is there a way to lazy-load the inputs to use less
memory? (e.g. load all the items in batches of 20, and discard the
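For what it's worth, the framework does not hold all the values in memory: the
Iterable streams them one at a time out of the sorted, merged map output, and
the value object is reused across iterations. A minimal sketch of a reducer
that works in this streaming fashion (new-API Reducer, with made-up
Text/LongWritable types for illustration):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Sums the values for each key without ever holding more than one value in
// memory: the Iterable is backed by the sorted on-disk map output, and the
// framework reuses the same LongWritable instance on every iteration.
public class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
    private final LongWritable result = new LongWritable();

    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException {
        long sum = 0;
        for (LongWritable value : values) {
            // 'value' is the same object each time around; copy it if you
            // need to keep it past this loop, never cache the reference.
            sum += value.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}

The flip side of that object reuse is that caching references to 'value' across
iterations silently gives you the last value for every element.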
I have given HBase a fair amount of thought, and I am looking for
input. Instead of managing incremental loads myself, why not just
set up an HBase cluster? What are some of the trade-offs?
My primary use for this cluster would still be data
analysis/aggregation and not so much random access.
This directory can get very large; in many cases I doubt it would fit on a
RAM disk.
Also, RAM disks tend to help most with random reads/writes; since Hadoop does
mostly linear IO, you may not see a great benefit from the RAM disk.
On Mon, Oct 3, 2011 at 12:07 PM, Vinod Kumar Vavilapalli
Vinod
Carefully checked everything again. The permissions are 775 and the owner is
hdfs:hadoop. The task tracker creates a directory called toBeDeleted under
/ramdisk, so things do not seem to be permission related. The task tracker
starts happily if I don't mount the ramdisk and leave
Edward
I understand the size limitations - but for my experiment the ramdisk size I
have created is large enough.
I think there will be substantial benefits from putting the intermediate map
outputs on a ramdisk - size permitting, of course, but I can't provide any
numbers to substantiate my
Raj,
I just tried this on my CDH3u1 VM, and the ramdisk worked the first
time. So, it's possible you've hit a bug in CDH3b3 that was later
fixed. Can you enable debug logging in log4j.properties and then
repost your task tracker log? I think there might be more details that
it will print that
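In case it is not obvious where to flip that switch, the change is usually a
single line in conf/log4j.properties; a sketch, assuming the stock logger
naming from the 0.20 packages:

# conf/log4j.properties - log the mapred daemons (TaskTracker included) at DEBUG
log4j.logger.org.apache.hadoop.mapred=DEBUG

Restart the task tracker afterwards so it picks the setting up.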
Just to make sure I was clear enough:
- Is there a parameter that allows setting the size of the batch of elements
that are retrieved into memory while the reduce task iterates over the input
values?
Thanks,
Sami Dalouche
On Mon, Oct 3, 2011 at 1:42 PM, Sami Dalouche sa...@hopper.com wrote:
Hi,
Joey
Thanks. Will try and upgrade to a newer version and check. I will also change
the logs to debug and see if more information is available.
Raj
From: Joey Echeverria j...@cloudera.com
To: common-user@hadoop.apache.org; Raj V rajv...@yahoo.com
Sent:
Hi Hadoopers,
I am writing a script to detect whether any running job has been running
longer than X hours.
So far, I use
./hadoop job -jt jobtracker:port -list all |awk '{ if($2==1) print $1 }'
--- to get the list of running JobIDs.
I am trying to find a way to get how long each job has been
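If the awk parsing gets brittle, the same information is available through the
JobClient API; a minimal sketch against the old org.apache.hadoop.mapred API
(the 4-hour threshold is just an example value):

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobStatus;

// Prints every RUNNING job older than X hours, using the job tracker
// configured in mapred-site.xml (mapred.job.tracker).
public class LongRunningJobs {
    public static void main(String[] args) throws Exception {
        long maxHours = 4; // X: example threshold
        JobClient client = new JobClient(new JobConf());
        for (JobStatus status : client.getAllJobs()) {
            if (status.getRunState() == JobStatus.RUNNING) {
                long ageMillis = System.currentTimeMillis() - status.getStartTime();
                if (ageMillis > maxHours * 60L * 60L * 1000L) {
                    System.out.println(status.getJobID() + " running for "
                            + (ageMillis / (60L * 60L * 1000L)) + "h");
                }
            }
        }
    }
}

Compile it against the cluster's hadoop-core jar and run it via bin/hadoop so
it picks up mapred.job.tracker from the site configuration.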
Thanks, Harsh - it worked out. Now I am able to start the cluster, but when I
tried to see the jobConf history at
http://localhost:50030/jobconf_history.jsp
I got the following message:
Missing 'logFile' for fetching job configuration!
On Sep 30, 2011, at 5:41 PM, Harsh J wrote:
Since you're only
I am not sure there is an easy way to get what you want on the command line.
One option is to use the following command, which gives you verbose job history
where you can find the Submit, Launch, and Finish times (including the duration
on the FinishTime line). I am using the hadoop-0.20.205.0 branch, so check if you
The steps in the following document worked for me, except:
-- JAVA_HOME needs to be set correctly in conf/hadoop-env.sh
-- By default on Mac OS X, sshd is not running, so you need to start it using
System Preferences/Sharing and add the users who are allowed to ssh in.
Sorry, a few more things:
-- localhost did not work for me; I had to use the machine name returned by
hostname, e.g. horton-mac.local
-- Also change localhost to your machine name in the conf/slaves and
conf/masters files.
--Suhas
On Mon, Oct 3, 2011 at 5:44 PM, Vitthal Suhas Gogate
Hi,
I have a question regarding performance and column value size.
I need to store several million integers per row. (Several million is
important here.)
I was wondering which method would be more beneficial performance-wise.
1) Store each integer in a single column, so that when a row is
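To make option 1 concrete, here is a minimal sketch of the one-column-per-integer
layout against the 2011-era HBase client API (the table name, family name, and
index-as-qualifier scheme are made up for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

// Option 1: each integer goes into its own column under one row key,
// so a row with N integers has N cells in family 'f'.
public class OneColumnPerInt {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable");
        byte[] family = Bytes.toBytes("f");
        Put put = new Put(Bytes.toBytes("row-1"));
        int[] values = {42, 7, 19}; // stand-in for the millions of integers
        for (int i = 0; i < values.length; i++) {
            // Qualifier is the index; one cell per integer.
            put.add(family, Bytes.toBytes(i), Bytes.toBytes(values[i]));
        }
        table.put(put);
        table.close();
    }
}

One thing to keep in mind either way: HBase never splits a single row across
regions, so a row with millions of cells lives entirely in one region on one
server.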