Hi,
Suppose I have 10 Windows machines, each independently running its own VM
instance. Can these VM instances communicate with each other so that I can
build a Hadoop cluster out of them?
Has anyone tried this?
I know we can set up
Hi,
Yes, it will work. HBase won't see the difference; it's pure VMware stuff.
Obviously, it's not something you can do for production or performance
analysis.
Cheers,
N.
On Wed, Sep 28, 2011 at 8:38 AM, praveenesh kumar praveen...@gmail.com wrote:
Hi,
Suppose I have 10 Windows
it's not something you can do for production or performance analysis.
Can you please tell me what this means?
Why can't we use this approach for production?
Thanks
On Tue, Sep 27, 2011 at 11:56 PM, N Keywal nkey...@gmail.com wrote:
Hi,
Yes, it will work. HBase won't see the
For example:
- It's adding two layers (Windows + Linux) that can both fail, especially
under heavy workload (and Hadoop is built to use all the resources
available). They will also need to be managed (software upgrades,
hardware support...); it's an extra cost.
- These two layers will use
On 28/09/11 04:19, Hamedani, Masoud wrote:
Special thanks for your help, Arko.
You mean that in Hadoop, the NameNode, DataNodes, JobTracker, TaskTrackers,
and all the cluster nodes should be deployed on Linux machines?
We have lots of data (on Windows) and code (written in C#) for data
mining; we want to use
On 28/09/11 08:37, N Keywal wrote:
For example:
- It's adding two layers (Windows + Linux) that can both fail, especially
under heavy workload (and Hadoop is built to use all the resources
available). They will also need to be managed (software upgrades,
hardware support...); it's an extra
Hi,
Is it possible to get the process ID of each task in a MapReduce job?
When I run a MapReduce job and monitor it in Linux using ps, I only see
the ID of the MapReduce job process, not its constituent map/reduce
tasks.
The use case is to monitor the resource usage of each task by using
Hi hadoopers,
I was looking for a way to dump the Hadoop configuration, in order to check
whether what I just changed in mapred-site.xml has really kicked in.
I found that HADOOP-6184
(https://issues.apache.org/jira/browse/HADOOP-6184) is exactly what I
want, but the thing is that I am running CDH3u0, which is
You could always check the web UI's job history for that particular run, open
the job.xml, and search for the value that parameter had at runtime.
Matt
-Original Message-
From: patrick sang [mailto:silvianhad...@gmail.com]
Sent: Wednesday, September 28, 2011 4:00 PM
To:
The XML configuration file is also available under the Hadoop logs on the
JobTracker.
Raj
From: GOEKE, MATTHEW (AG/1000) matthew.go...@monsanto.com
To: common-user@hadoop.apache.org common-user@hadoop.apache.org
Sent: Wednesday, September 28, 2011 2:27 PM
Subject:
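A note on verifying this programmatically: the merged client-side
configuration can also be dumped in code. A minimal sketch, assuming the
stock org.apache.hadoop.conf.Configuration API of 0.20-era releases; the
class name DumpConf and the property used in the spot-check are only
examples. This shows what a client sees on its classpath, not necessarily
what a running daemon loaded at start-up.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;

// Print the merged configuration (the *-site.xml files found on the
// local classpath) as XML, then spot-check a single key.
public class DumpConf {
  public static void main(String[] args) throws IOException {
    Configuration conf = new Configuration();
    // A plain Configuration loads only core-*.xml by default, so pull
    // in mapred-site.xml from the classpath explicitly:
    conf.addResource("mapred-site.xml");
    conf.writeXml(System.out); // full merged view

    // The property name here is only an example:
    System.err.println("mapred.child.java.opts = "
        + conf.get("mapred.child.java.opts"));
  }
}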
Hi everyone,
I'm looking for some recommendations on how to get faster I/O out of our
Hadoop cluster.
Currently, our lab cluster is 8 worker nodes and 1 master node (with
NameNode and JobTracker).
Each worker node has:
- 48 GB RAM
- 16 processors (Intel Xeon E5630 @ 2.53 GHz)
- 1 Gb Ethernet
Is it possible to submit a series of MR jobs to the JobTracker to run in
sequence (one finishes, take its output if successful and feed it into the
next, etc.), or does it need to run client-side, using JobControl or
something like Oozie, or by rolling our own? What I'm looking for is
Hi all,
I met a problem when compiling hadoop-0.20.203.
I modified some code in the JobTracker and then compiled it; Eclipse tells
me that
/Users/zhunan/codes/hadoop-0.20.203.0/build/src/org/apache/hadoop/mapred/jobfailures_jsp.java:13
org.apache.hadoop.mapred.JSPUtil doesn't exist
and I just
Can't this be done with a simple shell script?
Raj
From: Aaron Baff aaron.b...@telescope.tv
To: common-user@hadoop.apache.org common-user@hadoop.apache.org
Sent: Wednesday, September 28, 2011 4:56 PM
Subject: Running multiple MR Job's in sequence
Is it
The process IDs of the individual tasks can be seen using the jps and
jconsole commands provided by Java.
The jconsole command on the command-line interface brings up a GUI for
monitoring running Java tasks.
The tasks are only visible as Java virtual machine instances to the OS
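A related trick: a task can also log its own PID from inside its JVM. On
HotSpot JVMs the runtime MXBean name is conventionally "pid@hostname",
though that format is not guaranteed by the JVM spec, so treat this as a
best-effort diagnostic. A minimal sketch; the class name TaskPid is made
up for illustration.

import java.lang.management.ManagementFactory;

// Let a task report its own process id by parsing the JVM's runtime
// name, which on HotSpot looks like "12345@hostname".
public class TaskPid {
  public static long currentPid() {
    String jvmName = ManagementFactory.getRuntimeMXBean().getName();
    return Long.parseLong(jvmName.split("@")[0]);
  }

  public static void main(String[] args) {
    // In a real job you would call currentPid() from the mapper's
    // configure()/setup() method and write it to the task log.
    System.out.println("pid = " + currentPid());
  }
}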
Hi,
The way I did it is to have multiple JobConfs and run them one after
another in the program, as the logic requires.
The setOutputPath of the previous job can be the setInputPath of the next
one if you want to take the output from the previous job and feed it as
input to the next (a sketch follows below).
Thanks & regards
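A minimal sketch of that pattern, using the old org.apache.hadoop.mapred
API: JobClient.runJob() blocks until the job finishes and throws an
IOException on failure, so the second job only starts if the first
succeeded. The class name, job names, and paths below are hypothetical.

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

// Run two jobs back to back, feeding job1's output into job2.
public class ChainedJobs {
  public static void main(String[] args) throws Exception {
    Path input = new Path("in");
    Path intermediate = new Path("tmp/step1");
    Path output = new Path("out");

    JobConf job1 = new JobConf(ChainedJobs.class);
    job1.setJobName("step-1");
    FileInputFormat.setInputPaths(job1, input);
    FileOutputFormat.setOutputPath(job1, intermediate);
    // ... set mapper/reducer/key/value classes for step 1 ...
    JobClient.runJob(job1); // blocks; throws on failure

    JobConf job2 = new JobConf(ChainedJobs.class);
    job2.setJobName("step-2");
    // Output of the previous job becomes the input of the next one.
    FileInputFormat.setInputPaths(job2, intermediate);
    FileOutputFormat.setOutputPath(job2, output);
    // ... set mapper/reducer/key/value classes for step 2 ...
    JobClient.runJob(job2);
  }
}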
Dear Steve,
thanks for your useful comments, I completely agree with your idea.
Personally, for more than 10 years I have been using only Fedora, Java,
Java-related technologies, and open source software in all of my projects,
but this is a critical situation: all of the current data and apps in our
univ's lab
Within the Hadoop core project, there is JobControl, which you can utilize
for this. You can view its API at
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/jobcontrol/package-summary.html
and it is fairly simple to use (create jobs with the regular Java API, build
a dependency flow
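A rough sketch of that usage, assuming two already-configured JobConfs
where job2 consumes the output directory of job1; the driver class and
method names are hypothetical.

import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.jobcontrol.Job;
import org.apache.hadoop.mapred.jobcontrol.JobControl;

// Express the two jobs and their dependency, then let JobControl
// schedule them in order.
public class JobControlDriver {
  public static void runChain(JobConf jobConf1, JobConf jobConf2)
      throws Exception {
    Job job1 = new Job(jobConf1);
    Job job2 = new Job(jobConf2);
    job2.addDependingJob(job1); // job2 waits until job1 succeeds

    JobControl control = new JobControl("chain");
    control.addJob(job1);
    control.addJob(job2);

    // JobControl is a Runnable; run it in its own thread and poll.
    Thread runner = new Thread(control);
    runner.start();
    while (!control.allFinished()) {
      Thread.sleep(1000);
    }
    control.stop();
  }
}

If needed, control.getFailedJobs() can be inspected afterwards for error
handling.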
Hello Bikash,
The tasks run on the TaskTrackers, so that is where you'll need to look
for the process IDs -- not on the JobTracker/client.
Crudely speaking:
$ ssh tasktracker01 # or whichever
$ jps | grep Child | cut -d ' ' -f 1
# And lo, PIDs to play with.
On Thu, Sep 29, 2011 at 12:15 AM, bikash
Hello!
I am also a new Hadoop user, and I have met the same problem as you.
I don't know how to solve the invalid file name issue in Rumen.
My error says: WARN rumen.TraceBuilder: File skipped: Invalid file name:
job_201109221644_0001_username
If you have resolved your problem, can you help me? :)
my