Hi Folks,
I am looking for some advice on some of the ways/techniques that
people are using to get around namenode failures (both disk and host).
We have a small cluster with several jobs scheduled for periodic
execution on the same host where the name server runs. What we would like to
Hello,
Is it possible to avoid the re-replication caused by datanode decommissioning?
I want to stop datanodes without moving or copying data blocks, even
though the blocks would then have a smaller replication factor.
Out of curiosity, how reliable are the counters, from the perspective of the
JobClient, while the job is in progress? While hitting 'refresh' on the
status web page for a job, I notice that my counters bounce all over the
place, showing wildly different figures from second to second. Is that using a
I had a similar problem when I upgraded... not sure of details why, but
I had permissions problems trying to develop and run on windows out of
cygwin.
I found that in cygwin, if I ran under my account, I got the null pointer
exception, but if I ssh to localhost first, then format the name node,
We have just realized that one reason for the 'no live node contains block'
error from DFSClient is that the DFSClient was unable to open a connection
due to insufficient available file descriptors.
FsShell is particularly bad about consuming descriptors and leaving the
C G wrote:
I've got a grid which has been up and running for some time. It's been using a
32-bit JVM. I am hitting the wall on memory within the NameNode and need to
specify a max heap size of 4G. Is it possible to switch seamlessly from a 32-bit JVM to 64-bit?
I've tried this on a small test grid and
Thanks for the replies, folks. We are not seeing this frequently, but we
just want to avoid a single point of failure and keep manual
intervention to a minimum, or ideally eliminate it. This is to ensure that
the system runs smoothly in production without abrupt failures.
Thanks
-Ankur
-Original Message-
Thanks Arun for your tip.
This morning I changed to submitJob and polled. It worked very well,
and you saved me some trial and error.
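For anyone searching the archives later, the submit-and-poll pattern discussed here looks roughly like this with the old mapred API. This is a minimal sketch: the job setup is elided and the class name is just a placeholder.

```java
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RunningJob;

public class SubmitAndPoll {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(SubmitAndPoll.class);
        // ... set input/output paths, mapper/reducer classes here ...

        // submitJob() returns immediately, unlike runJob(), which blocks
        JobClient client = new JobClient(conf);
        RunningJob job = client.submitJob(conf);

        // Poll until the job finishes, reporting progress as we go
        while (!job.isComplete()) {
            System.out.printf("map %.0f%% reduce %.0f%%%n",
                    job.mapProgress() * 100, job.reduceProgress() * 100);
            Thread.sleep(5000);
        }
        if (!job.isSuccessful()) {
            throw new RuntimeException("Job failed");
        }
    }
}
```

The advantage over runJob() is that the caller keeps control between polls, e.g. to update its own status display or to kill the job on a timeout.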
-Original Message-
From: Aaron Kimball [mailto:[EMAIL PROTECTED]
Sent: Monday, November 10, 2008 4:35 AM
To: core-user@hadoop.apache.org
Subject:
Hello,
I am doing a task which will read dbRecord data from a web service,
and then I will build an index on it. But as you can see, inside Hadoop the
InputFormat is based on FileInputFormat, so now I have to write my own
dbRecordInputFormat, and I do it like this:
import
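A rough sketch of that shape: for non-file data you can implement the old mapred InputFormat interface directly instead of extending FileInputFormat. Everything web-service-specific here (WebServiceSplit, fetchNextRecord(), the key/value types) is a placeholder, not a working client.

```java
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

// Hypothetical InputFormat that pulls records from a web service, so it
// implements InputFormat directly rather than extending FileInputFormat.
public class DbRecordInputFormat implements InputFormat<LongWritable, Text> {

    // Minimal split carrying no location info; a real one would serialize
    // the shard/range of remote records it covers.
    public static class WebServiceSplit implements InputSplit {
        public long getLength() { return 0; }
        public String[] getLocations() { return new String[0]; }
        public void write(java.io.DataOutput out) {}
        public void readFields(java.io.DataInput in) {}
    }

    public InputSplit[] getSplits(JobConf job, int numSplits) throws IOException {
        // One split per shard of the remote data; here a single split
        return new InputSplit[] { new WebServiceSplit() };
    }

    public RecordReader<LongWritable, Text> getRecordReader(
            InputSplit split, JobConf job, Reporter reporter) throws IOException {
        return new RecordReader<LongWritable, Text>() {
            private long pos = 0;

            // Placeholder for the actual web-service call
            private String fetchNextRecord() throws IOException {
                return null;
            }

            public boolean next(LongWritable key, Text value) throws IOException {
                String record = fetchNextRecord();
                if (record == null) return false;  // no more records
                key.set(pos++);
                value.set(record);
                return true;
            }
            public LongWritable createKey() { return new LongWritable(); }
            public Text createValue()       { return new Text(); }
            public long getPos()            { return pos; }
            public float getProgress()      { return 0.0f; }  // total unknown
            public void close() {}
        };
    }
}
```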
Between 0.15 and 0.18 the format for fs.default.name has changed; you should
set the value there to hdfs://localhost:9000/ without the quotes.
It still shouldn't give you an NPE under any circumstances (that should
probably get a JIRA entry), but putting a value in the (new) proper format
might
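i.e., something like this in hadoop-site.xml (host and port should match whatever your namenode actually uses):

```xml
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000/</value>
</property>
```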
Allen,
It sounds like you think the 64- and 32-bit environments are effectively
interchangeable. May I ask why you are using both? The 64-bit environment
gives you access to more memory; do you see faster performance for the TTs
in 32-bit mode? Do you get bitten by library compatibility bugs that
Do you know about the jobtracker page? Visit http://yournamenode:50030.
This page (served by Jetty) gives you statistics about your cluster and
each MR job.
Alex
On Sun, Nov 9, 2008 at 11:33 PM, ZhiHong Fu [EMAIL PROTECTED] wrote:
Hello:
I have implemented a Map/Reduce job, which will
There has been a lot of discussion on this list about handling namenode
failover. Generally the most common approach is to back up the namenode to
an NFS mount and manually instantiate a new namenode when your current
namenode fails.
As Hadoop exists today, the namenode is a single point of
On 11/10/08 6:18 AM, Brian MacKay [EMAIL PROTECTED] wrote:
At Apachecon, we think we identified a case where someone forgot to copy
the
Hi,
To make Hadoop/MapReduce available for developers to experiment
with, we are setting up a cluster with Hadoop/MapReduce and a dataset,
and providing instructions on how developers can use streaming to submit
jobs from their own machines.
For purposes of explanation here, we can assume
On 11/10/08 12:21 PM, Rick Hangartner [EMAIL PROTECTED] wrote:
But is there a proper way to allow developers to specify a remote_username
they legitimately have access to on the cluster, if it is not the same
as the local_username of the account on their own machine that they are
using to submit
hi all,
I have a data set stored in HBase, and I run a MapReduce program
to analyze it. Now I want to know: how many maps are in a map task?
I want to use the number of maps in my program. For
example, if there are 100 maps in a map task, I want to collect all
the values, and analyze these
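One way to get at that number, sketched against the old mapred API, is to read the configured map-task count in configure(). Note this is only a hint to the framework; the actual number of map tasks equals the number of input splits. The class and key/value types below are illustrative, not from the original program.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

// Sketch: a mapper that reads the configured map-task count from the JobConf.
public class CountingMapper extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {

    private int numMaps;

    public void configure(JobConf job) {
        // equivalent to job.getInt("mapred.map.tasks", 1)
        numMaps = job.getNumMapTasks();
    }

    public void map(LongWritable key, Text value,
                    OutputCollector<Text, IntWritable> out, Reporter reporter)
            throws IOException {
        // numMaps is now available to the map logic
    }
}
```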
Hello:
I have customized a DbRecordAndOpInputFormt which will retrieve
data from several web services, and the data format is like the dataItem in
database ResultSets.
And now I have encountered a problem: I get the right (key, value) in the
DbRecordReader next() method, but in the Mapper
Is there a reducer in your program? Or do you need to output the result
on the map side?
2008/11/11 ma qiang [EMAIL PROTECTED]:
Yes,
it needs further analysis in the reducer.
On Tue, Nov 11, 2008 at 10:28 AM, Mice [EMAIL PROTECTED] wrote:
Hello,
I am using JobControl to run a sequence of jobs (Job_1, Job_2, ..., Job_n)
one after the other. Each job returns some information,
e.g.
key1 value1,value2
key2 value1,value2
and so on. This can be found in the outdir passed to the jar file.
Is there a way for Job_1 to return some data (which can be
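For the ordering part, the usual JobControl pattern looks roughly like the sketch below (JobConf setup elided). Passing data from Job_1 to Job_2 is typically done through HDFS, by pointing Job_2's input path at Job_1's outdir; small values can also be written into the second job's JobConf before it is submitted.

```java
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.jobcontrol.Job;
import org.apache.hadoop.mapred.jobcontrol.JobControl;

public class ChainJobs {
    public static void main(String[] args) throws Exception {
        JobConf conf1 = new JobConf();  // configure Job_1: its output dir...
        JobConf conf2 = new JobConf();  // ...is set as Job_2's input dir

        Job job1 = new Job(conf1);
        Job job2 = new Job(conf2);
        job2.addDependingJob(job1);  // Job_2 starts only after Job_1 succeeds

        JobControl control = new JobControl("chain");
        control.addJob(job1);
        control.addJob(job2);

        // JobControl runs in its own thread and launches jobs as their
        // dependencies complete
        new Thread(control).start();
        while (!control.allFinished()) {
            Thread.sleep(1000);
        }
        control.stop();
    }
}
```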
In case we are starting the namenode on a different host, the configuration
on all the cluster nodes will need to be updated before a cluster
restart, right?
-Original Message-
From: Alex Loddengaard [mailto:[EMAIL PROTECTED]
Sent: Tuesday, November 11, 2008 12:07 AM
To:
But when I run it, it throws an exception in the DbRecordReader.next()
method. Although I have logging in it, I still can't see anything, and I
don't know where I should check. Who can help me find where I can get the
real execution status, so I can see where the error is? Thanks!
Check the logs
A couple of things that one can do:
1. dfs.name.dir should have at least two locations, one on the local
disk and one on NFS. This means that all transactions are
synchronously logged to two places.
2. Create a virtual IP, say name.xx.com, that points to the real
machine name of the machine on
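For point 1, the setting in hadoop-site.xml would look something like this (the two paths are just examples, one on local disk and one on an NFS mount):

```xml
<property>
  <name>dfs.name.dir</name>
  <!-- comma-separated list: local disk first, NFS mount second -->
  <value>/local/hadoop/namedir,/mnt/nfs/hadoop/namedir</value>
</property>
```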