I believe it's this:
mapred.submit.replication (default: 10)
The replication level for submitted job files. This
should be around the square root of the number of nodes.
You can set it per job in the job specific conf and/or in mapred-site.xml.
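The square-root guidance above can be sketched as a quick calculation. This helper is purely illustrative (not a Hadoop API), just to make the rule of thumb concrete:

```python
import math

def suggested_submit_replication(num_nodes):
    # "Around the square root of the number of nodes",
    # never below 1; a hypothetical helper, not part of Hadoop.
    return max(1, round(math.sqrt(num_nodes)))

# A 100-node cluster suggests a level of 10 (the stock default);
# a pseudo-distributed single node suggests 1.
```

Per job, the resulting value would go into mapred.submit.replication in the job conf, as described above.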
Friso
On 19 May 2011, at 03:42, Steve Cohen
I've filed a JIRA and patch for this issue:
https://issues.apache.org/jira/browse/HDFS-1958
Thanks for the feedback, all.
-Todd
On Wed, May 18, 2011 at 7:46 PM, Jonathan Disher wrote:
> On May 18, 2011, at 4:54 PM, Aaron Eng wrote:
>> Case in point, I noted a while ago that when you run the nam
On May 18, 2011, at 4:54 PM, Aaron Eng wrote:
> Case in point, I noted a while ago that when you run the namenode -format
> command, it only accepts a capital Y (or lower case, can't remember), and it
> fails silently if you give the wrong case. I didn't particularly care enough
> to fix it, ha
KFS
On May 18, 2011 7:03 PM, "Thanh Do" wrote:
> hi hdfs users,
>
> Is anybody aware of a system that is similar to HDFS, in the sense
> that it has a single-master architecture and the master also keeps
> an operation log?
>
> Thanks,
>
> Thanh
Lustre? Though it doesn't do MapReduce
Sent from my iPhone
On May 18, 2011, at 7:03 PM, "Thanh Do" wrote:
> hi hdfs users,
>
> Is anybody aware of a system that is similar to HDFS, in the sense
> that it has a single-master architecture and the master also keeps
> an operation log?
>
> Than
hi hdfs users,
Is anybody aware of a system that is similar to HDFS, in the sense
that it has a single-master architecture and the master also keeps
an operation log?
Thanks,
Thanh
Where is the default replication factor on job files set? Is it different than
the dfs.replication setting in hdfs-site.xml?
Sent from my iPad
On May 18, 2011, at 9:10 PM, Joey Echeverria wrote:
> Did you run a map reduce job?
>
> I think the default replication factor on job files is 10, whi
System: HDFS dirs across cluster as dataone, datatwo, datathree
I recently had an issue where I lost a slave, which resulted in a large number
of under-replicated blocks.
The replication was quite slow on the uptake so I thought running a hadoop
balancer would help.
This seemed to exacerbate th
> If you have some specific issues you'd like to point out, please file
> JIRAs. I'll be sure to take a look.
>
If others would like to comment, see HDFS-1960: dfs.*.dir should not default
to /tmp (or other typically volatile storage).
I won't speak to other usability issues like NameNode format
Did you run a map reduce job?
I think the default replication factor on job files is 10, which
obviously doesn't work well on a pseudo-distributed cluster.
-Joey
On Wed, May 18, 2011 at 5:07 PM, Steve Cohen wrote:
> Thanks for the answer. Earlier, I asked about why I get occasional not
> repli
On Wed, May 18, 2011 at 4:55 PM, Aaron Eng wrote:
>>Most of the contributors are big picture types who would look at "small"
>> usability issues like this and scoff about "newbies".
> P.S. This is speaking from the newbie perspective, it was not meant as a
> slight to contributors in any way. Jus
Thanks for the answer. Earlier, I asked why I get occasional "not
replicated yet" errors. I had dfs.replication set to one, so what
replication could it have been doing? Did the error messages actually
mean that the file couldn't be created in the cluster?
Thanks,
Steve Cohen
On May 1
> Most of the contributors are big picture types who would look at "small"
> usability issues like this and scoff about "newbies".
P.S. This is speaking from the newbie perspective, it was not meant as a
slight to contributors in any way. Just a comment on the steep learning
curve of picking up Hadoo
Hey Tim,
Hope everything is good with you. Looks like you're having some fun with
hadoop.
> Can anyone enlighten me? Why is defaulting dfs.*.dir to /tmp a good idea?
It's not a good idea, it's just how it defaults. You'll find hundreds or
probably thousands of these quirks as you work with Apache/Cl
Can anyone enlighten me? Why is defaulting dfs.*.dir to /tmp a good idea? I'd
rather, in order of preference, have the following behaviours if dfs.*.dir
are undefined:
1. Daemons log errors and fail to start at all,
2. Daemons start but default to /var/db/hadoop (or any persistent
location),
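Behaviour 1 (fail fast) and behaviour 2 (persistent fallback) could look something like the sketch below. The function name, the fallback path, and the volatile-prefix check are all hypothetical, just to make the preference concrete:

```python
VOLATILE_PREFIXES = ("/tmp",)           # assumption: treat /tmp as volatile
PERSISTENT_FALLBACK = "/var/db/hadoop"  # hypothetical default from option 2

def resolve_dfs_dir(configured, fail_fast=True):
    # Option 1: refuse to start when dfs.*.dir is undefined or volatile.
    # Option 2: fall back to a persistent location instead.
    if configured and not configured.startswith(VOLATILE_PREFIXES):
        return configured
    if fail_fast:
        raise ValueError("dfs.*.dir undefined or volatile: %r" % (configured,))
    return PERSISTENT_FALLBACK
```

Either behaviour would beat silently losing the namespace on the next /tmp cleanup.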
Tried to send this, but apparently SpamAssassin finds emails about
"replicas" to be spammy. This time with less rich text :)
On Wed, May 18, 2011 at 3:35 PM, Todd Lipcon wrote:
>
> Hi Steve,
> Running setrep will indeed change those files. Changing "dfs.replication"
> just changes the default re
Say I add a datanode to a pseudo cluster and I want to change the
replication factor to 2. I see that I can either run hadoop fs -setrep
or change the hdfs-site.xml value for dfs.replication. But do either
of these cause the existing blocks to replicate?
Thanks,
Steve Cohen
Hello,
We are running Nutch on an HDFS distributed filesystem. On occasion, we
get errors like:
2011-05-18 02:05:42,132 WARN hdfs.DFSClient -
NotReplicatedYetException sleeping /cluster/hadoop/mapred/system/
job_201105171220_0255/job.jar retries left 4
2011-05-18 02:05:42,535 WARN hdfs.DFSClient
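For what it's worth, those WARN lines reflect a bounded retry loop in the client: it sleeps and re-tries until "retries left" reaches zero. A rough, purely illustrative sketch (attempt_write and the exception type are stand-ins, not the real DFSClient internals):

```python
import time

def write_with_retries(attempt_write, retries=5, backoff_secs=0.4):
    # Keep re-trying while the namenode says the previous block is
    # not replicated yet; give up only after `retries` attempts.
    for attempt in range(retries):
        try:
            return attempt_write()
        except RuntimeError:  # stand-in for NotReplicatedYetException
            if attempt == retries - 1:
                raise
            time.sleep(backoff_secs)
```

So a handful of these warnings followed by success is normal; it only becomes an error if the retries are exhausted.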
Hi.
We are using a cluster of 2 computers (1 namenode and 2 secondary nodes)
to store a large number of text files in HDFS. The process had been
running for at least a couple of weeks when suddenly, due to a power
failure, the server got reset. So, in effect, HDFS didn't stop
cleanly.
On Tue, May 17, 2011 at 06:22:00PM -0700, Time Less wrote:
> I now have a metaquestion: is there a default Hadoop configuration out there
> somewhere that has all critical parameters at least listed, if not filled out
> with some sane defaults? I keep discovering undefined parameters via unusual
>