Sorry about the formatting on that, I hit send before I'd checked it. Here
it is again, hopefully a bit more legibly (and with a fix):
> I implemented something similar last year to guarantee resource
provisioning when we deployed to YARN. We stuck to one-label-per-node to
keep things relatively
Heya,
I implemented something similar last year to guarantee resource
provisioning when we deployed to YARN. We stuck to one-label-per-node
to keep things relatively simple. Iirc, these are the basic steps:
- add `yarn.node-labels.configuration-type=centralized` to your yarn-site.xml
- set up
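A rough sketch of how the rest of the setup usually looks, assuming a centralized label store (the label, host, and path names below are only placeholders):

  # yarn-site.xml, alongside the property above
  yarn.node-labels.enabled=true
  yarn.node-labels.fs-store.root-dir=hdfs:///yarn/node-labels

  # register a label and attach it to a node
  yarn rmadmin -addToClusterNodeLabels "gpu"
  yarn rmadmin -replaceLabelsOnNode "worker1=gpu"

Queues are then given access to the label through
yarn.scheduler.capacity.<queue-path>.accessible-node-labels in capacity-scheduler.xml.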
You can automate creating a capacity-scheduler.xml based on the
requirement; after that you can deploy it on the RM and refresh the queues.
Is your requirement not to restart the RM, or not to change the capacity
scheduler?
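For reference, the refresh step is typically just (assuming the regenerated
capacity-scheduler.xml has already been copied into the RM's conf directory):

  yarn rmadmin -refreshQueues

This reloads the queue configuration without restarting the ResourceManager.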
On Thu, May 13, 2021 at 2:45 PM 慧波彭 wrote:
> Hello, we use capacity scheduler
Hello, we use the capacity scheduler to allocate resources in our production
environment, and use node labels to isolate resources.
We have a requirement to dynamically create node labels and
associate them with existing queues without
changing capacity-scheduler.xml.
Does anyone know how
Max file size is not directly configurable, but other settings can effectively
limit it, such as the maximum number of blocks per file,
dfs.namenode.fs-limits.max-blocks-per-file. This prevents the creation of
extremely large files, which can degrade performance.
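For illustration, the property goes in hdfs-site.xml (the value here is just an
example, not the default):

  <property>
    <name>dfs.namenode.fs-limits.max-blocks-per-file</name>
    <value>100000</value>
  </property>

With a 128 MB block size, that example would cap a single file at roughly 12 TB.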
Hi,
I don't have much knowledge about Hadoop/HDFS, so my question may be simple,
or not...
I have a Hadoop/HDFS environment, but my disks are not very big.
One application is writing files, but sometimes the disk fills up with
very large files.
So, my question is:
Is there any way to
Hi Evelina!
You've posted the logs for the MapReduce ApplicationMaster. From this I
can see the reducer timed out after 600 secs:
2017-04-21 00:24:07,747 INFO [AsyncDispatcher event handler]
org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: Diagnostics
report from
The Hadoop version that I use is 2.7.1 and the HBase version is 1.2.5.
I can do any operation from the HBase shell.
On Fri, Apr 21, 2017 at 8:01 AM, evelina dumitrescu <
evelina.a.dumitre...@gmail.com> wrote:
> Hi,
>
> I am new to Hadoop and Hbase.
> I was trying to make a small proof-of-concept
Hi,
I am new to Hadoop and HBase.
I was trying to make a small proof-of-concept Hadoop MapReduce job that
reads data from HDFS and stores the output in HBase.
I did the setup as presented in this tutorial [1].
Here is the pseudocode from the map reduce code [2].
The problem is that I am
Can you share the following info from your environment, so that we can better
help you with this issue? Simply looking at Java exceptions may
not be enough:
- Hadoop version
- Java version
- Memory allocations and heap size for the node manager and containers
- How the job was run: Hive,
Looks like a problem with the java executable or the container-executor file
on the nodes.
I would recommend verifying the java executable on all nodes
(/usr/bin/java). It is possible the links are missing.
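One quick way to check this on every node (assuming passwordless ssh and a
slaves file listing the hostnames; adjust the path to your setup):

  for host in $(cat conf/slaves); do
    ssh "$host" 'ls -l /usr/bin/java && /usr/bin/java -version'
  done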
Regards,
Gagan Brahmi
On Sat, Apr 2, 2016 at 9:40 AM, 169517388 <169517...@qq.com> wrote:
>
to hadoop.org:
Hello hadoop.org. I'm a new guy who is learning Hadoop right now. I've
built a 5-node experimental Hadoop environment. When I ran the MR program, this
error came out.
I searched a lot; maybe it's the Java runtime environment or something else, but I
didn't get it.
Hi,
Under the hadoop-mapreduce-project directory, I notice the following two
directories:
1. hadoop-mapreduce-client/
2. src/
The second, src/, can be expanded to
java/org/apache/hadoop/mapreduce.
My question is what that directory is for. And I wonder why the MapReduce code
is in hadoop-mapreduce-client.
Hi John,
FWIW, setting the log level of org.apache.hadoop.security.UserGroupInformation
to ERROR seemed to prevent the fatal NameNode slowdown we ran into. Although I
still saw "no such user" Shell$ExitCodeException messages in the logs, these
only occurred every few minutes or so. Thus, it
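For reference, one way to set that level is a single line in log4j.properties:

  log4j.logger.org.apache.hadoop.security.UserGroupInformation=ERROR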
Hi John,
My AWS Elastic MapReduce NameNode is also filling its log file with messages
like the following:
2014-02-18 23:56:52,344 WARN org.apache.hadoop.security.UserGroupInformation
(IPC Server handler 78 on 9000): No groups available for user
job_201402182309_0073
2014-02-18 23:56:52,351
Subject: Re: question about hadoop dfs
1. Is supergroup a directory? Where is it located?
supergroup is a user group rather than a directory, just like a user
group in Linux.
2. I searched for abc.txt on master 172.11.12.6 and node1 172.11.12.7 with the
following command:
the metadata (file name, file
I use Hadoop 2.2.0 to create a master node and a sub node, as follows:
Live Datanodes: 2
Node | Transferring Address | Last Contact | Admin State | Configured Capacity (GB) |
Used (GB) | Non DFS Used (GB) | Remaining (GB) | Used (%)
master | 172.11.12.6:50010 | 1 | In Service
It just seems like lazy code. You can see that, later, there is this:
{code}
for (Token<?> token : UserGroupInformation.getCurrentUser().getTokens()) {
  childUGI.addToken(token);
}
{code}
So eventually the JobToken is getting added to the UGI which runs task-code.
Thanks Vinod for your quick response. It is running in non-secure mode.
I still don't get what the purpose of using the job id in the UGI is. Could you
please explain a bit more?
Thanks,
John
On Wed, Jan 8, 2014 at 10:11 AM, Vinod Kumar Vavilapalli
vino...@hortonworks.com wrote:
It just seems like
I looked a bit deeper, and it seems this code was introduced by the following
JIRA:
https://issues.apache.org/jira/browse/MAPREDUCE-1457
There is another related JIRA, i.e.,
https://issues.apache.org/jira/browse/MAPREDUCE-4329.
Perhaps the warning message is a side effect of JIRA MAPREDUCE-1457 when
Hi,
I looked at Hadoop 1.X source code and found some logic that I could not
understand.
In the org.apache.hadoop.mapred.Child class, there were two UGIs defined as
follows.
UserGroupInformation current = UserGroupInformation.getCurrentUser();
current.addToken(jt);
Hi all,
I have a question that I cannot answer by myself; I hope anyone can help.
If I do not set up HA, the client can query DNS to get the HDFS entry point, but if I
set up NameNode HA, how does the client know which host it should talk to?
The client uses the class configured in dfs.client.failover.proxy.provider.[nameservice ID] to
find the active namenode. The default is ConfiguredFailoverProxyProvider. You
can plug in your own implementation and specify it in the config file.
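For example, in hdfs-site.xml (the nameservice name "mycluster" is just a placeholder):

  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>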
Regards,
Bharath .V
w:http://researchweb.iiit.ac.in/~bharath.v
On Fri,
Dear Sir
We are students at Hosei University.
We are studying Hadoop now for research.
We use Hadoop 2.0.0-CDH4.2.1 MRv2 and its environment is CentOS 6.2.
We can access HDFS from the master and slaves.
We have some questions.
Master: Hadoop04
Slaves: Hadoop01
Hadoop02
Hadoop03
We run the
Hi Tsuyoshi,
Did you run the wordcount sample in hadoop-examples.jar?
Can you share the command that you ran?
Thanks,
--
Tatsuo
On Tue, Aug 6, 2013 at 3:55 PM, 間々田 剛史
tsuyoshi.mamada...@stu.hosei.ac.jp wrote:
Dear Sir
We are students at Hosei University.
We are studying Hadoop now for research.
we
Hi
You need to check your ResourceManager log and the log of the container
allocated by your RM.
Sent from my iPhone
On 2013-8-6, at 15:30, manish dunani manishd...@gmail.com wrote:
After checking your error code,
I think you entered the wrong map and reduce classes.
Can you please show me the code?
Then
Hi, all,
I have two simple questions about hadoop 2 (or YARN).
1. When will the stable version of hadoop 2.x come out?
2. We currently have a cluster deployed with hadoop 1.x; is there any way
to upgrade it to hadoop 2.x without damaging the existing data in the current
HDFS?
Thank you very much!
Hi,
On Mon, Jul 22, 2013 at 7:26 AM, Yexi Jiang yexiji...@gmail.com wrote:
Hi, all,
I have two simple questions about hadoop 2 (or YARN).
1. When will the stable version of hadoop 2.x come out?
No fixed date yet, but probably later this year. There is a formal
beta that should be released soon
Hi,
I am modifying the dependencies for the Mahout package (the open source machine
learning package built on top of Hadoop).
I am a bit confused about why there are so many Hadoop dependencies in the Maven
project; there are four artifactIds:
1) hadoop-core, 2) hadoop-common,
Hi, Jack
I also encountered this problem, so I would like to know how you dealt with it.
Thanks
Fuzhen Zheng, from China
Hi,
I am using the latest Cloudera distribution, and with that I am able to use
the latest Hadoop API, which I believe is 0.21, for such things as
import org.apache.hadoop.mapreduce.Reducer;
So I am using mapreduce, not mapred, and everything works fine.
However, in a small streaming job,
I am sure that if you ask on the provider's specific list you'll get a better answer
than from the common Hadoop list ;)
Cos
On Wed, Sep 14, 2011 at 09:48PM, Mark Kerzner wrote:
Hi,
I am using the latest Cloudera distribution, and with that I am able to use
the latest Hadoop API, which I believe is
I am sorry, you are right.
mark
On Wed, Sep 14, 2011 at 9:52 PM, Konstantin Boudnik c...@apache.org wrote:
I am sure if you ask at provider's specific list you'll get a better answer
than from common Hadoop list ;)
Cos
On Wed, Sep 14, 2011 at 09:48PM, Mark Kerzner wrote:
Hi,
I am
On 09/15/2011 08:18 AM, Mark Kerzner wrote:
Hi,
I am using the latest Cloudera distribution, and with that I am able to use
the latest Hadoop API, which I believe is 0.21, for such things as
import org.apache.hadoop.mapreduce.Reducer;
So I am using mapreduce, not mapred, and everything works
Thank you, Prashant, it seems so. I already verified this by refactoring the
code to use 0.20 API as well as 0.21 API in two different packages, and
streaming happily works with 0.20.
Mark
On Wed, Sep 14, 2011 at 11:46 PM, Prashant prashan...@imaginea.com wrote:
On 09/15/2011 08:18 AM, Mark
Hadoop embeds Jetty directly into the Hadoop servers with the
org.apache.hadoop.http.HttpServer class for servlets. For JSP, web.xml
is auto-generated with the Jasper compiler during the build phase. The
new web framework for MapReduce 2.0 (MAPREDUCE-2399) wraps the Hadoop
HttpServer and doesn't need
web.xml is in:
hadoop-<releaseNo>/webapps/job/WEB-INF/web.xml
Mark
On Thu, May 26, 2011 at 1:29 AM, Luke Lu l...@vicaya.com wrote:
Hadoop embeds jetty directly into hadoop servers with the
org.apache.hadoop.http.HttpServer class for servlets. For jsp, web.xml
is auto generated with the
Hi, Chen
How have things been going recently?
Actually, I think you misunderstand the code in assignTasks() in
JobQueueTaskScheduler.java; see the following structure of the interesting
code:
// I'm sorry, I hacked the code so much that the variable names may be
// different from the original version
for
Hi,
The following is the setting of mapred-site.xml where I have set
mapred.child.java.opts to -Xmx512 -Xincgc, and fs.checkpoint.size to
268435456. But in the runtime job.xml, I found that it is still
using the default value mapred.child.java.opts=-Xmx200, and the
Set them to final if you don't want the default values being applied.
A <final>true</final> addition should solve your problem (although it
may generate some warnings when your job tries to override them with
their defaults).
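For example, in mapred-site.xml (values are illustrative; note the heap size
needs a unit suffix such as m):

  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx512m -Xincgc</value>
    <final>true</final>
  </property>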
(Default value xml files are in the Hadoop jars and are usually picked
up
Hi Nan,
Thank you for the reply. I understand what you mean. What concerns me is that
inside the obtainNewLocalMapTask(...) method, it only assigns one task at a
time.
Now I understand why it only assigns one task at a time. It is because of the
outside loop:
for (i = 0; i < MapperCapacity; ++i) {
(..)
Hi, Chen
Actually it's not one task each time;
see this statement:
assignedTasks.add(t);
assignedTasks is the return value of this method, and it's a collection of
selected tasks; it will contain multiple tasks if the candidates are there.
Best,
Nan
On Tue, Jan 18, 2011 at 12:24 AM, He Chen
OK, I got your point:
you mean why don't we put the for loop into obtainNewLocalMapTask().
Yes, I think we can do that, but the result is the same as with the current code,
and I don't think it will bring many performance benefits;
personally, I like the current style. :-)
Best,
Nan
On
Hey all
Why does the FCFS scheduler only let a node choose one task at a time from
one job? In order to increase data locality,
it is reasonable to let a node choose all its local tasks (if it can)
from a job at a time.
Any reply will be appreciated.
Thanks
Chen
Please refer to this website: http://www.apache.org/foundation/marks/
Sincerely yours,
Edward J. Yoon.
2010/9/11 PAL A.Suzuki asuz...@proxure.asia:
Dear Apache Hadoop Project team,
We are a software developer in Japan, a very small start-up company.
We'd like to use your Hadoop (yellow
Dear Apache Hadoop Project team,
We are a software developer in Japan, a very small start-up company.
We'd like to use your Hadoop (yellow elephant word) logo on our home page
and in some explanation materials (PPT,
Word, etc.).
Could you let us know the logo usage guidelines, or a URL where they are written?
Hello,
I want to deploy Hadoop on a cluster. In this cluster, different nodes
share the same file system: if I make changes to files on node1, then other
nodes will see the same changes. (The file system of this cluster is
probably NFS.)
I don't know whether this cluster is fit for
If HDFS uses NFS to store files, then all I/O during the
execution of map and reduce tasks will use the NFS instead of the
local disks on each machine in the cluster (if they have one). This
can become a bottleneck if you have lots of tasks running
simultaneously.
However, even with the
Dechao bu,
Be aware that running Hadoop on NFS can give you problems with locks.
And if you're looking to process large files, your network will probably be
a bottleneck.
--
Edson Ramiro Lucas Filho
http://www.inf.ufpr.br/erlf07/
On 12 May 2010 12:47, abhishek sharma absha...@usc.edu
Edson,
Be aware that running Hadoop on NFS can give you problems with locks.
What locks are you referring to? I ran Hadoop on NFS and never ran
into any problems. I had a small cluster with 10 servers all connected
to the same switch.
And if you're looking to process large files, your
Abhishek,
I'm running Hadoop here and the cluster admin had mounted the NFS with the
nolocks option.
So, I was getting a "No locks available" message.
I mentioned it just so you pay attention to this kind of config ;)
--
Edson Ramiro Lucas Filho
http://www.inf.ufpr.br/erlf07/
On 12 May 2010 14:42,
How do I remove a datanode? Do I simply destroy my datanode, and the namenode
will automatically detect it? Is there a more elegant way to do it?
Also, when I remove a datanode, does hadoop automatically re-replicate the data
right away?
Thanks,
Harold
It's in the FAQ:
http://wiki.apache.org/hadoop/FAQ#17
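For reference, the usual decommission flow in that era (property name and
command are from the pre-2.x tooling; the file location is a placeholder):
list the datanode's hostname in an exclude file referenced by dfs.hosts.exclude
in the namenode's configuration, then run

  hadoop dfsadmin -refreshNodes

and wait for the node to show as Decommissioned before shutting it down; HDFS
re-replicates its blocks as part of decommissioning.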
Brian
On Jun 4, 2009, at 6:26 PM, Harold Lim wrote:
How do I remove a datanode? Do I simply destroy my datanode, and
the namenode will automatically detect it? Is there a more elegant
way to do it?
Also, when I remove a datanode,
Hi:
What is the relationship between Hadoop and Amazon EC2?
Can Hadoop run directly on a common PC (not a server)?
Why do some people say Hadoop runs on Amazon EC2?
Thanks!
--
Nitesh Bhatia
Dhirubhai Ambani Institute of Information Communication Technology
Gandhinagar
Gujarat
Life is never perfect. It just
Jason Rutherglen wrote:
I implemented an RMI protocol using Hadoop IPC and implemented basic
HMAC signing. It is, I believe, faster than public/private key signing
because it uses a secret key and does not require public key
provisioning like PKI would. Perhaps it would be a baseline way to
sign the
However, HDFS uses HTTP to serve blocks up - that needs to be locked down
too. Would the signing work there?
I am not familiar with HDFS over HTTP. Could it simply sign the
stream and include the signature at the end of the HTTP message
returned?
On Tue, Sep 30, 2008 at 8:56 AM, Steve
Owen O'Malley wrote:
On Sep 24, 2008, at 1:50 AM, Trinh Tuan Cuong wrote:
We are developing a project and we intend to use Hadoop to handle
the processing of vast amounts of data. But to convince our customers
about the use of Hadoop in our project, we must show them the
advantages (and
On Sep 24, 2008, at 1:50 AM, Trinh Tuan Cuong wrote:
We are developing a project and we intend to use Hadoop to
handle the processing of vast amounts of data. But to convince our
customers about the use of Hadoop in our project, we must show
them the advantages (and maybe? the
Subject: Re: Question about Hadoop's Feature(s)
On Sep 24, 2008, at 1:50 AM, Trinh Tuan Cuong wrote:
We are developing a project and we intend to use Hadoop to
handle the processing of vast amounts of data. But to convince our
customers about the use of Hadoop in our project, we must show
Thank you very much for explaining it to me, Ted. That's a great deal of
info!
I guess that could be how the Yahoo WebMap is designed.
And for anyone trying to figure out the massiveness of Hadoop computing,
http://open.blogs.nytimes.com/2007/11/01/self-service-prorated-super-computing-fun/ should
Usually hadoop programs are not used interactively since what they excel at
is batch operations on very large collections of data.
It is quite reasonable to store resulting data in hadoop and access those
results using hadoop. The cleanest way to do that is to have a presentation
layer web
Message
From: Chanchal James [EMAIL PROTECTED]
To: core-user@hadoop.apache.org
Sent: Thursday, June 12, 2008 9:42:46 AM
Subject: Question about Hadoop
Hi,
I have a question about Hadoop. I am a beginner and just testing Hadoop.
Would like to know how a PHP application would benefit from
Once it is in HDFS, you already have backups (due to the replicated file
system).
Your problems with deleting the dfs data directory are likely configuration
problems combined with versioning of the data store (done to avoid
confusion, but usually causes confusion). Once you get the
Thank you all for the responses.
So in order to run a web-based application, I just need to put the part of
the application that needs to make use of distributed computation in HDFS,
and have the other website-related files access it via Hadoop streaming?
Is that how Hadoop is used?
Sorry
I upgraded 0.16.3 to 0.17, and an error appears when starting dfs and the jobtracker. What can
I do about it? Thanks!
I have used the "start-dfs.sh -upgrade" command to upgrade the filesystem.
Below is the error log:
2008-05-26 09:14:33,463 INFO org.apache.hadoop.mapred.JobTracker:
STARTUP_MSG:
On Sun, 23 Mar 2008, Chaman Singh Verma wrote:
Hello,
I am exploring Hadoop and MapReduce and I have one very simple question.
I have a 500GB dataset on my local disk and I have written both Map and Reduce
functions. Now how should I start?
1. I copy the data from the local disk to DFS. I have
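For step 1, the usual commands are along these lines (paths are placeholders):

  hadoop dfs -mkdir /user/you/input
  hadoop dfs -put /local/path/to/dataset /user/you/input/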