The fsimage file size is 1658934155 bytes (roughly 1.5 GB).
2013/8/13 Harsh J ha...@cloudera.com
How large are your checkpointed fsimage files?
On Mon, Aug 12, 2013 at 3:42 PM, lei liu liulei...@gmail.com wrote:
When the Standby NameNode is doing a checkpoint, it uploads the image file
to the Active NameNode, and the Active
Thanks Harsh,
Appreciate your input, as always.
On Aug 12, 2013, at 20:01, Harsh J ha...@cloudera.com wrote:
If you're not already doing it, run a local name caching daemon (such
as nscd) on each cluster node. Hadoop does a lot of lookups and
a local cache would help reduce the
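As a hedged illustration (assuming Debian/Ubuntu-style nodes; package and service names may differ on your distribution), enabling nscd on each node could look like:

# install and start the name service cache daemon (Debian/Ubuntu commands assumed)
sudo apt-get install nscd
sudo service nscd start
# confirm it is running
sudo service nscd status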
Sorry for not giving version details.
I am using Hadoop version 1.1.2 and HBase version 0.94.7.
On Tue, Aug 13, 2013 at 1:53 PM, Vimal Jain vkj...@gmail.com wrote:
Hi,
I have configured Hadoop and HBase in pseudo-distributed mode.
So far things were working fine, but suddenly I started
Along with these exceptions I am seeing some exceptions in the HBase logs too.
Here they are:
*Exception in Master log:*
2013-07-31 15:51:04,694 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
1266874891120ms instead of 1ms, this is likely due to a long garbage collecting pause and it's
Hi,
One of your DNs is marked as dead because the NN is not able to get heartbeat
messages from it, but the NN is still getting block information from the dead
node. This error is similar to a bug, *HDFS-1250*, reported 2 years back and
fixed in the 0.20 release.
Can you please check the status of the DNs in the cluster?
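As a hedged example, the stock admin command below should show which DataNodes the NameNode currently considers live or dead:

# prints per-DataNode status (live/dead) and capacity as seen by the NameNode
hadoop dfsadmin -report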
Hi Jitendra,
Thanks for your reply.
Currently my Hadoop/HBase setup is down in production, as the above exceptions
had filled up the disk space with log files and it had to be brought down.
Also, I am using Hadoop/HBase in pseudo-distributed mode, so there is only
one node which hosts all 6 processes (
I wrote a program to test NameNode performance. Please see
EditLogPerformance.java.
I use 60 threads to execute the EditLogPerformance.java code; the testing
result is below:
2013-08-13 17:43:01,479 INFO my.EditLogPerformance
(EditLogPerformance.java:run(37)) - totalCount:10392810
Perhaps turning on fsimage compression may help. See
documentation of dfs.image.compress at
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml.
You can also try to throttle the bandwidth it uses via
dfs.image.transfer.bandwidthPerSec.
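For illustration, a hedged hdfs-site.xml sketch (the property names come from hdfs-default.xml; the codec and bandwidth values are only example choices, not recommendations):

<!-- write the fsimage compressed -->
<property>
  <name>dfs.image.compress</name>
  <value>true</value>
</property>
<property>
  <name>dfs.image.compression.codec</name>
  <value>org.apache.hadoop.io.compress.DefaultCodec</value>
</property>
<!-- throttle checkpoint image transfer to ~1 MB/s (example value; 0 means unthrottled) -->
<property>
  <name>dfs.image.transfer.bandwidthPerSec</name>
  <value>1048576</value>
</property>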
On Tue, Aug 13, 2013 at
Hi,
I'm currently using Maven to build the jars necessary for my
map-reduce program to run, and it works for a single-node cluster.
For a multi-node cluster, how do I specify that my map-reduce program
should pick up the cluster settings instead of the localhost settings?
I don't know how to specify this using
Hi,
I agree with Harsh's comment on the image file compression and transfer
bandwidth parameters for optimizing the checkpoint process.
In addition, I'm not able to correlate your performance program's log
timings (less than 10) with the file transfer log timings on the
active/standby nodes.
Thanks
On Tue, Aug 13,
You need to configure your NameNode and JobTracker information in the
configuration files within your application. Only set the relevant
properties in the copy of the files that you are bundling with your job. For
the rest, the default values from the default configuration files will be used.
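As a hedged sketch (hostnames and ports are placeholders), the two properties that usually matter for a Hadoop 1.x client are the NameNode and JobTracker addresses:

<!-- core-site.xml: point the client at the cluster NameNode (placeholder host) -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode.example.com:9000</value>
</property>

<!-- mapred-site.xml: point the client at the JobTracker instead of 'local' -->
<property>
  <name>mapred.job.tracker</name>
  <value>jobtracker.example.com:9001</value>
</property>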
Hi Pavan,
Configuration properties generally aren't included in the jar itself unless you
explicitly set them in your Java code. Rather, they're picked up from the
mapred-site.xml file located in the Hadoop configuration directory on the host
you're running your job from.
Is there an issue
Hi,
As Jitendra pointed out, this issue was fixed in the 0.20 version.
I am using Hadoop 1.1.2, so why is it occurring again?
Please help here.
On Tue, Aug 13, 2013 at 2:56 PM, Vimal Jain vkj...@gmail.com wrote:
Hi Jitendra,
Thanks for your reply.
Currently my hadoop/hbase is down in production as
Hi Shahab and Sandy,
The thing is, we have a 6-node Cloudera cluster running. For
development purposes, I was building a map-reduce application on a
single-node Apache distribution of Hadoop with Maven.
To be frank, I don't know how to deploy this application on a multi-node
Cloudera cluster. I am
I was able to execute the example by running the job as the yarn user.
For example, the following successfully completes:
sudo -u yarn yarn org.apache.hadoop.examples.RandomWriter /tmp/random-out
Whereas this fails with the local user rpaulk:
yarn org.apache.hadoop.examples.RandomWriter
When I actually run the job on the multi-node cluster, the logs show it
uses the localhost configuration, which I don't want.
I just have a pom.xml which lists all the dependencies: standard
Hadoop, standard HBase, standard ZooKeeper, etc. Should I remove these
dependencies?
I want the cluster
Hi,
My application has a group of processes that need to communicate with
each other either through shared memory or TCP/IP, depending on whether the
containers are allocated on the same machine or on different machines.
Obviously I would like to get them allocated on the same node
whenever
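A minimal sketch, assuming the Hadoop 2.x AMRMClient API is available to your ApplicationMaster (the hostname, resource sizes, and the amrmClient variable are placeholders): with relaxLocality set to false the request should not fall back to other nodes.

import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;

// request a container on one specific node; relaxLocality=false keeps the
// scheduler from satisfying the request elsewhere (it may wait instead)
Resource capability = Resource.newInstance(1024 /* MB */, 1 /* vcores */);
Priority priority = Priority.newInstance(0);
String[] nodes = { "worker01.example.com" }; // placeholder hostname
AMRMClient.ContainerRequest request = new AMRMClient.ContainerRequest(
    capability, nodes, null /* racks */, priority, false /* relaxLocality */);
amrmClient.addContainerRequest(request); // amrmClient: an already-started AMRMClient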
I've been stuck on the same question lately, so don't take this as definitive,
just my best guess at what's required.
Using Maven as your Hadoop source is going to give you a vanilla Hadoop, one
that runs on localhost. You need one that you've customized to point to your
remote cluster, and you
Nothing in your pom.xml should affect the configurations your job runs with.
Are you running your job from a node on the cluster? When you say localhost
configurations, do you mean it's using the LocalJobRunner?
-sandy
(iphnoe tpying)
On Aug 13, 2013, at 9:07 AM, Pavan Sudheendra
Yes Sandy, I'm referring to LocalJobRunner. I'm actually running the
job on one datanode.
What changes should I make so that my application takes advantage
of the cluster as a whole?
On Tue, Aug 13, 2013 at 10:33 PM, sandy.r...@cloudera.com wrote:
Nothing in your pom.xml should affect
You should not use LocalJobRunner. Make sure that the mapred.job.tracker
property does not point to 'local' and instead points to your JobTracker host
and port.
*But before that*, as Sandy said, your client machine (from where you will
be kicking off your jobs and apps) should be using config files which
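As a hedged way to verify what the client actually resolves: org.apache.hadoop.conf.Configuration has a main() that dumps the merged configuration as XML, so running it through the hadoop launcher should reveal whether mapred.job.tracker still says 'local':

# dumps the merged client configuration; look for mapred.job.tracker in the output
hadoop org.apache.hadoop.conf.Configuration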
That link got my hopes up. But Cloudera Manager (what I'm running; on CDH4)
does not offer an Export Client Config option. What am I missing?
On Aug 13, 2013, at 4:04 PM, Shahab Yunus shahab.yu...@gmail.com wrote:
You should not use LocalJobRunner. Make sure that the mapred.job.tracker
In our Cloudera 4.2.0 cluster, I log in with the *admin* user (do you have
appropriate permissions, by the way?). Then I click on any one of the 3
services (hbase, mapred, hdfs; excluding zookeeper) from the top-leftish
menu. Then for each of these I can click the *Configuration* tab, which is
in the
Folks, can you please take this thread to CDH related mailing list?
On Tue, Aug 13, 2013 at 3:07 PM, Brad Cox bradj...@gmail.com wrote:
That link got my hopes up. But Cloudera Manager (what I'm running; on
CDH4) does not offer an Export Client Config option. What am I missing?
On Aug 13,
Hi everyone, I recently updated my cluster to 1.2.1 and now the
percentage of completed map tasks keeps changing while the job is running:
13/08/13 16:53:01 INFO mapred.JobClient: Running job: job_201308131452_0007
13/08/13 16:53:02 INFO mapred.JobClient: map 0% reduce 0%
13/08/13 16:53:19
Hi,
I have to run some analytics on the files present in HDFS using MATLAB code.
I am thinking of compiling the MATLAB code into a C++ library and calling it
from map-reduce code. How can I implement this? I read that Hadoop Streaming
or Hadoop Pipes can be used for this, but I have not tried it on
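As a hedged sketch of the Streaming route (the jar path matches a Hadoop 1.x tarball layout, and matlab_mapper stands in for your compiled executable, which would read records on stdin and write key<TAB>value lines to stdout):

# map-only Streaming job; -file ships the binary to the task nodes
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming-1.1.2.jar \
  -input /data/in \
  -output /data/out \
  -mapper ./matlab_mapper \
  -reducer NONE \
  -file matlab_mapper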
I am working on a MapReduce job where I would like to have the output
sorted by a LongWritable value. I read the Anatomy of a MapReduce Run chapter
in the Definitive Guide, and it didn't say explicitly whether reduce() gets
called only once per map output key. If it does get called only once, I was
thinking
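For what it's worth, the framework does call reduce() once per distinct map output key (within each reducer), passing all of that key's values together. A minimal hedged sketch (new-API signatures; the class name and value type are illustrative) that gets output sorted by a LongWritable simply by making it the map output key:

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// keys arrive sorted by the shuffle; reduce() runs once per distinct key
public class LongKeySortReducer extends Reducer<LongWritable, Text, LongWritable, Text> {
  @Override
  protected void reduce(LongWritable key, Iterable<Text> values, Context context)
      throws IOException, InterruptedException {
    for (Text value : values) {
      context.write(key, value); // emitted in ascending key order within this reducer
    }
  }
}

Note that with more than one reducer the output is only sorted within each reducer's part file; a total order across parts needs something like TotalOrderPartitioner.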