-- Forwarded message --
From: Vikas Jadhav vikascjadha...@gmail.com
Date: Tue, Jan 22, 2013 at 5:23 PM
Subject: Bulk Loading DFS Space issue in Hbase
To: u...@hbase.apache.org
Hi,
I am trying to bulk load 700m of CSV data with 31 columns into HBase.
I have written a MapReduce program for
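(For readers landing on this thread: a common pattern for this kind of bulk load is a
map-only job that turns each CSV line into a Put, lets HFileOutputFormat.configureIncrementalLoad()
wire up the partitioning/reducer, and then loads the HFiles with the completebulkload tool.
The sketch below is generic and not the poster's program; the column family "cf" and the
assumption that the first field is the row key are purely illustrative.)

import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CsvToPutMapper
    extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

  private static final byte[] CF = Bytes.toBytes("cf"); // hypothetical column family

  @Override
  protected void map(LongWritable offset, Text line, Context context)
      throws IOException, InterruptedException {
    String[] fields = line.toString().split(",", -1);
    byte[] rowKey = Bytes.toBytes(fields[0]); // assumes the first column is the row key
    Put put = new Put(rowKey);
    // The remaining columns become qualifiers c1, c2, ... in the single family.
    for (int i = 1; i < fields.length; i++) {
      put.add(CF, Bytes.toBytes("c" + i), Bytes.toBytes(fields[i]));
    }
    context.write(new ImmutableBytesWritable(rowKey), put);
  }
}

In the driver, HFileOutputFormat.configureIncrementalLoad(job, table) sets up total-order
partitioning so the generated HFiles line up with the table's regions.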
Hi,
I was tuning a MapReduce job to reduce the number of spills and reached a stage where
the following counters are all equal:
Spilled Records (map) = Spilled Records (reduce) = Combine Output Records =
Reduce Input Records
I do not see any lines in the mapper logs containing the following strings:
1. Spilling map
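(If it helps to cross-check those numbers outside the web UI, the same counters can be read
from the finished job object; a minimal sketch, assuming the Hadoop 2.x TaskCounter enum and
a driver that already holds the completed Job:)

import org.apache.hadoop.mapreduce.Counters;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.TaskCounter;

public class SpillReport {
  // Call with a Job that has already completed, e.g. after waitForCompletion(true).
  public static void print(Job job) throws Exception {
    Counters counters = job.getCounters();
    long spilled    = counters.findCounter(TaskCounter.SPILLED_RECORDS).getValue();
    long combineOut = counters.findCounter(TaskCounter.COMBINE_OUTPUT_RECORDS).getValue();
    long reduceIn   = counters.findCounter(TaskCounter.REDUCE_INPUT_RECORDS).getValue();
    System.out.println("spilled=" + spilled
        + " combineOut=" + combineOut
        + " reduceIn=" + reduceIn);
  }
}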
Hi!
When I run a job with these options:
-Dmapred.map.child.java.opts=-Xmx2048M
-Dio.sort.mb=1424
-Dio.sort.record.percent=0.08
all tasks fail at the combiner step with:
...
2013-01-23 12:20:28,143 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 1424
2013-01-23 12:23:03,772 INFO
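(Two things worth checking with options like these: the io.sort.mb buffer is allocated inside
the map task's heap, so it has to fit comfortably within the -Xmx given to the map JVM, and
-D options only take effect if the driver routes them through GenericOptionsParser. A minimal
Tool-based driver, purely a generic sketch and not the poster's code:)

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJobDriver extends Configured implements Tool {

  @Override
  public int run(String[] args) throws Exception {
    // getConf() already carries any -D options parsed by ToolRunner,
    // e.g. -Dio.sort.mb=1424 -Dmapred.map.child.java.opts=-Xmx2048M
    Job job = new Job(getConf(), "my-job");
    job.setJarByClass(MyJobDriver.class);
    // set mapper/combiner/reducer classes here
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    System.exit(ToolRunner.run(new Configuration(), new MyJobDriver(), args));
  }
}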
Iterating on Bharath's responses, my answers to each of your questions
inline:
On Wed, Jan 23, 2013 at 2:54 PM, Dibyendu Karmakar
dibyendu.d...@gmail.comwrote:
Hi,
I am doing some performance testing on Hadoop, but while testing I ran into
a situation where I need your help.
My Hadoop cluster:
Can you add in what exactly your combiner logic is performing?
On Wed, Jan 23, 2013 at 3:09 PM, s2323 s2...@land.ru wrote:
Hi!
When I run a job with these options:
-Dmapred.map.child.java.opts=-Xmx2048M
-Dio.sort.mb=1424
-Dio.sort.record.percent=0.08
all tasks fail at the combiner step
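(For reference, a combiner in the new API is just a Reducer registered with
job.setCombinerClass(); a minimal word-count-style example, purely illustrative and not the
poster's actual logic:)

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class SumCombiner extends Reducer<Text, IntWritable, Text, IntWritable> {

  private final IntWritable total = new IntWritable();

  @Override
  protected void reduce(Text key, Iterable<IntWritable> values, Context context)
      throws IOException, InterruptedException {
    int sum = 0;
    for (IntWritable v : values) {
      sum += v.get(); // partial aggregation on the map side cuts spill and shuffle bytes
    }
    total.set(sum);
    context.write(key, total);
  }
}
// In the driver: job.setCombinerClass(SumCombiner.class);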
Can somebody answer me on this, please?
On Wed, Jan 23, 2013 at 11:44 AM, Mohit Vadhera
project.linux.p...@gmail.com wrote:
Thanks, guys. As you said, the level is already pretty low, i.e. 100 MB, but in
my case the root fs / has 14 GB available. What can be the root cause then?
Mohit,
When do you specifically get the error at the NN? Does your NN consistently
not start with that error?
Your local disk space availability can certainly fluctuate if you use the
same disk for MR and other activity which creates temporary files.
On Wed, Jan 23, 2013 at 9:01 PM, Mohit
Hi, everyone!
I want to use Nutch to crawl web pages, but a problem shows up in the log
as below. I think it may be a permissions problem, but I am not sure.
Any help will be appreciated. Thank you.
2013-01-23 07:37:21,809 ERROR mapred.FileOutputCommitter - Mkdirs failed to
create
The NN switches randomly into safe mode, and then I run a command to leave safe
mode manually. I never got alerts for low disk space at the machine level, and
I didn't see the space fluctuate from GBs into MBs.
On Wed, Jan 23, 2013 at 9:10 PM, Harsh J ha...@cloudera.com wrote:
Mohit,
When do you
A random switching behavior can only be explained by fluctuating disk space,
I'd think. Are you running MR operations on the same disk (i.e. is it
part of mapred.local.dir as well)?
On Wed, Jan 23, 2013 at 9:24 PM, Mohit Vadhera project.linux.p...@gmail.com
wrote:
NN switches randomly into
What version of Hadoop are you using, and is your use of the local
(non-cluster) job runner mode intentional?
On Wed, Jan 23, 2013 at 9:23 PM, 吴靖 qhwj2...@126.com wrote:
hi, everyone!
I want to use Nutch to crawl web pages, but a problem shows up in the
log as below. I think it may be some
MR operations are running on the same machine. I checked for the parameter
mapred.local.dir in my install directory /etc/hadoop/ but didn't find it.
One question: is the disk space reserved size displayed in the logs in KB or MB?
I am a layman on Hadoop. The link I followed to install is given below.
The logs display it in simple bytes. If the issue begins to occur when you
start using Hadoop, then it's most certainly MR using up the disk space
temporarily.
You could lower the threshold, or you could perhaps use a bigger disk for
your trials/more nodes.
On Wed, Jan 23, 2013 at 10:25 PM,
On Wed, Jan 23, 2013 at 10:37 PM, Mohit Vadhera
project.linux.p...@gmail.com wrote:
51200
51200 *bytes* is 50 KB. 50 MB is 50*1024*1024, which is 52428800. You can
verify changes to config by visiting the http://NNHOST:50070/conf page and
searching for the config key name to see if the NN has
Also, pls stop the cc. Tx.
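(Assuming the threshold being discussed is the NameNode's local-disk reserve,
dfs.namenode.resource.du.reserved — an assumption, since the key name isn't quoted in the
thread — a 50 MB setting in hdfs-site.xml would look like this, with the value in bytes:)

<property>
  <name>dfs.namenode.resource.du.reserved</name>
  <!-- 50 MB expressed in bytes: 50 * 1024 * 1024 -->
  <value>52428800</value>
</property>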
On Jan 23, 2013, at 9:06 AM, Harsh J wrote:
Again, moving to cdh-u...@cloudera.org. Please use the
https://groups.google.com/a/cloudera.org/forum/?fromgroups=#!forum/cdh-user
forum for CDH-specific issues as this list is for help with Apache Hadoop
Question
We're trying out Cloudera Manager and CDH4 in a clustered deployment and are having
trouble getting the TaskTrackers to start up.
The error says (full stack trace below):
2013-01-23 10:48:37,443 ERROR org.apache.hadoop.mapred.TaskTracker: Can not
start task tracker because
This is the problem:
drwx------ 4 hdfs hdfs 4096 Jan 15 16:37 ..
Your /data/1 directory seems to be owned by hdfs and restricted only
to it (700). I'm not sure this is necessary and you can perhaps make
it 755 at least.
Or perhaps what you may have is a misconfig wherein you've set your DN
Hi,
We've recently built a hadoop package that we're somewhat happy with
which we'd now like to deploy on Amazon's EC2. However we built against
the Hadoop release 1.1.0, and there doesn't appear to be a public AMI
image for Hadoop 1.1.0. Will we have to build our own AMI, or is there
another
Pardon if my use of AMZN jargon is wrong here, because I don't quite use
it much: I don't think we carry/maintain an AMI. However, there's the
Apache Whirr project that deals with Hadoop over cloud, and you can
probably take a look/ask there: http://whirr.apache.org?
On Wed, Jan 23, 2013 at 11:55 PM,
My apologies for sending this message to this group, but I'm having trouble
sending to the right group.
From: Steven Wong
Sent: Wednesday, January 23, 2013 11:15 AM
To: impala-u...@cloudera.org
Subject: RE: Need help with cluster setup for performance
Thanks
Hello Radim,
Your solution sounds interesting. Is it possible for me to try the solution
before I buy it?
Thanks,
Regards
On Wed, Jan 23, 2013 at 1:07 AM, Radim Kolar h...@filez.com wrote:
I have a solution integrating Spring beans and Spring Batch directly into
Hadoop core. It's far more
Hi,
I am getting this error when I run a Java application that uses the HDFS API to
transfer files to HDFS remotely. This used to work fine with CDH3, and now we
are using CDH4.
Exception in thread "main" java.io.IOException: No FileSystem for scheme: hdfs
at
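(This error typically means the hdfs:// FileSystem implementation isn't visible to the client;
in CDH4 / Hadoop 2 the HDFS client classes live in a separate hadoop-hdfs artifact, and
repackaged "uber" jars sometimes lose the META-INF/services registration. A minimal client
sketch, with a hypothetical NameNode address and the fs.hdfs.impl override shown as a commonly
used workaround rather than a required setting:)

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsUpload {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Workaround when a shaded jar drops the service-loader metadata for the hdfs scheme:
    conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
    FileSystem fs = FileSystem.get(URI.create("hdfs://namenode-host:8020"), conf);
    fs.copyFromLocalFile(new Path(args[0]), new Path(args[1])); // local source, HDFS destination
    fs.close();
  }
}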
My Hadoop version is hadoop-1.1.1, and it runs in local mode!
At 2013-01-24 00:43:53, Harsh J ha...@cloudera.com wrote:
What version of Hadoop are you using, and is your use of the local
(non-cluster) job runner mode intentional?
On Wed, Jan 23, 2013 at 9:23 PM, 吴靖 qhwj2...@126.com
Hi all,
I found that HDFS runs du periodically (every hour), and because my disks are big
(the smallest one is 15 TB), the datanode does not respond for about 3 minutes
because of the I/O load whenever HDFS executes du. This causes a lot of problems.
Does anybody know why HDFS does this and how to disable it?
--
Thanks Regards
Hi Barak,
As instructed on
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/lib/MultipleOutputs.html,
do you also make sure to call the mos.close() function at the end of
Mapper (in its cleanup stage)?
On Thu, Jan 24, 2013 at 12:40 PM, Barak Yaish barak.ya...@gmail.com
Yes, I'm calling mos.close() in Mapper.cleanup(). Are there any logs
that I can turn on to troubleshoot this issue?
On Thu, Jan 24, 2013 at 9:36 AM, Harsh J ha...@cloudera.com wrote:
Hi Barak,
As instructed on
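(For readers hitting the same issue, the usual new-API pattern is to create the MultipleOutputs
instance in setup() and close it in cleanup(); a generic sketch, assuming a hypothetical named
output called "text" registered in the driver:)

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class MultiOutMapper extends Mapper<LongWritable, Text, Text, Text> {

  private MultipleOutputs<Text, Text> mos;

  @Override
  protected void setup(Context context) {
    mos = new MultipleOutputs<Text, Text>(context);
  }

  @Override
  protected void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {
    // Write to the named output instead of (or in addition to) the default output.
    mos.write("text", new Text(key.toString()), value);
  }

  @Override
  protected void cleanup(Context context) throws IOException, InterruptedException {
    mos.close(); // without this, buffered named-output files may never be flushed
  }
}
// In the driver:
// MultipleOutputs.addNamedOutput(job, "text", TextOutputFormat.class, Text.class, Text.class);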
Hi,
HDFS does this to estimate space reports. Perhaps the discussion here
may help you: http://search-hadoop.com/m/LLBgUiH0Bg2
On Thu, Jan 24, 2013 at 12:51 PM, Xibin Liu xibin.liu...@gmail.com wrote:
Hi all,
I found that HDFS runs du periodically (every hour), and because my disks are big,
the smallest one
Thanks, http://search-hadoop.com/m/LLBgUiH0Bg2 is my issue, but I still
don't know how to solve this problem. Three minutes of no response once an hour
is a big problem for me. Any clue for this?
2013/1/24 Harsh J ha...@cloudera.com
Hi,
HDFS does this to estimate space reports. Perhaps the