Jim the Standing Bear standingbear@... writes:
Hi,
I am new to Hadoop and tried to set up a distributed Hadoop system.
But when I tried to run the example job, it stack dumped with the
following exception:
org.apache.hadoop.ipc.RemoteException: java.io.IOException: Unknown
protocol to
Can you please post your conf files?
On Mon, Jan 7, 2013 at 8:11 PM, Prabu gipr...@gmail.com wrote:
You are pointing your JobClient to the Namenode. Check your
mapred.job.tracker address and make sure it points to the correct
JobTracker node.
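For what it's worth, a minimal sketch of the relevant mapred-site.xml entry (the hostname and port below are placeholders, not values from this thread):

```xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <!-- must point at the JobTracker, not the NameNode -->
    <value>jobtracker.example.com:9001</value>
  </property>
</configuration>
```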
Also 0.14 is very very old. Please use 1.* releases.
HTH,
+Vinod
On Mon, Jan 7, 2013 at 1:11 AM, Prabu gipr...@gmail.com wrote:
Jim the Standing Bear
Thanks for the replies.
Harsh J, hadoop classpath was exactly what I needed. Got it working now.
Cheers,
Krishna
On 6 January 2013 11:14, John Hancock jhancock1...@gmail.com wrote:
Krishna,
You should be able to take the command you are using to start the hadoop
job (hadoop jar ..) and
Any Suggestions?
Thanks and Regards,
Shyam
On Sun, Jan 6, 2013 at 3:00 PM, Hadoop Learner
hadooplearner1...@gmail.com wrote:
Hi All,
Working on a requirement for Hadoop job monitoring. The requirement is to get
the details of every task attempt of a running job. The details are the following:
Task Attempt
Hi,
I have been trying to play around with the hadoop jar command in 0.23.5 and
Hive 0.9.0, and wanted to run a custom MapReduce job using:
hadoop jar <jar> <main-class> -libjars <comma-separated list of files> \
  -files <comma-separated list of files>
Both -libjars and -files have the same files specified. The
What are the differences between the two?
It seems like an MR job could be configured using one or the other (e.g., extends
MapReduceBase implements Mapper vs. extends Mapper)
Cheers
Oleg
@Harsh J
I was following an explanation of Gridmix that runs it using the java
command, but here I ran it as hadoop jar with the jar file that is located
in the contrib/gridmix directory in 1.0.4.
As you can see, before I run Gridmix the bench folder in HDFS is non-existent,
but
Where do I get a 64-bit libhadoop.so for 2.0.2-alpha? The one under
HADOOP_INSTALL_DIR/lib/native is 32-bit, and I get a wrong ELF class error when
running the Hadoop client on a 64-bit platform.
Thanks,
Jane
I am a little new to the Hadoop world, but based on my readings and
understanding thus far, there is not much difference in functionality. The only
difference is that the new API allows you to implement push and pull mechanisms
in your map/reduce tasks, as against push in the old API. Mapper has been changed
to super
Hi James,
This should be fine, but just wanted to add that you will need to also make
the change on your other nodes within the cluster, so they know how to
contact the filesystem.
Regards,
Robert
On Sun, Jan 6, 2013 at 12:18 AM, Jagat Singh jagatsi...@gmail.com wrote:
Yes your data is safe.
Under normal operation, the NN takes care of under-replicated blocks by itself.
A file with a replication factor set higher than the cluster's node count
will also register its blocks as under-replicated. A common config
mistake here is mapred.submit.replication, which defaults to
10 (useful for
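A sketch of overriding that default in mapred-site.xml (the value 3 here is only an example, not a recommendation from this thread):

```xml
<property>
  <!-- replication factor for job submission files; defaults to 10 -->
  <name>mapred.submit.replication</name>
  <value>3</value>
</property>
```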
Thanks Harsh, you're the first, as usual.
Currently, there are 291 active nodes spread across 11 racks. We have had
rack-awareness enabled for a year (works like a champ).
I ran fsck throughout HDFS again and noticed that the majority of the
files that have under-replicated blocks
are set to have 255 replicas, but
Thanks, everybody, for the responses.
James
On Mon, Jan 7, 2013 at 12:49 PM, Robert Molina rmol...@hortonworks.com wrote:
It depends. What data is going into the table, and what keys will drive the
lookup?
Let's suppose that you have a single JSON file that has some reasonable number
of key/value tuples. You could easily load a Hashtable to associate the
integer keys with the values (which appear to be lists of
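That in-memory approach can be sketched in plain Java, assuming the key/value tuples have already been parsed out of the JSON (the keys and values below are made up for illustration):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EdgeLookup {
    // Build an in-memory lookup associating integer keys with lists of values,
    // mirroring JSON documents of the shape {key: [values]}.
    static Map<Integer, List<String>> buildTable() {
        Map<Integer, List<String>> table = new HashMap<>();
        table.computeIfAbsent(1, k -> new ArrayList<>()).add("a");
        table.computeIfAbsent(1, k -> new ArrayList<>()).add("b");
        table.computeIfAbsent(2, k -> new ArrayList<>()).add("c");
        return table;
    }

    public static void main(String[] args) {
        // Lookup is then a single hash probe
        System.out.println(buildTable().get(1)); // prints [a, b]
    }
}
```

Once the table fits in memory, each lookup is O(1) and no MapReduce pass is needed.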
Hi
Thanks for the reply. So here is the intent.
I process some data, and the output of that processing is this set of JSON
documents outputting {key: [values]} (this is essentially a form of graph
where each entry is an edge).
Now I process a different set of data, and the idea is to modify the
awesome.
thanks
On Mon, Jan 7, 2013 at 4:11 PM, John Lilley john.lil...@redpoint.net wrote:
Let’s call these “the graph” and “the changes”.
Will both the graph and the changes fit into memory?
Yes - You do not have a Hadoop-scale problem. Just write some code using
Are you sure the balancer does anything? I have about 500 missing replicas and
60 under-replicated blocks, and when I start the balancer it does not do anything.
The balancer outputs two lines
INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 0 over utilized nodes:
INFO
We would like to configure the equivalent of Fair Scheduler
userMaxJobsDefault = 1 (i.e. we would like to limit a user to a single job
in the cluster).
- By default the Capacity Scheduler allows multiple jobs from a
single user to run concurrently.
- From
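For comparison, the Fair Scheduler limit mentioned above goes in its allocation file; a minimal sketch:

```xml
<!-- fair-scheduler.xml (allocation file) -->
<allocations>
  <!-- limit each user to one concurrently running job -->
  <userMaxJobsDefault>1</userMaxJobsDefault>
</allocations>
```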
Hi Oleg,
The MapReduce 0.20.* API has support for both the 0.19 API (the mapred
package, in which your mapper should extend MapReduceBase and implement
Mapper) and the new API as well (the mapreduce package, where
you directly extend Mapper).
As there are significant
Hi Jamal,
Another simple approach, if your data is too huge and cannot fit into
memory, would be to just use the MultipleInputs mechanism,
where your MR job will have two mappers, one emitting the records
from the graph file and the other from the changes file.
Anyhow, your reducer
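The two-mapper idea can be sketched without Hadoop at all; a minimal pure-Java sketch of the tag-and-group pattern behind it (the G:/C: tags and record values are made up, and in the real job the tagging would happen inside the two mappers):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TaggedJoinSketch {
    // Tag records from the two "mappers" with their source, then group per key,
    // so the "reducer" sees all values for a key together.
    static Map<String, List<String>> join(Map<String, String> graph,
                                          Map<String, String> changes) {
        Map<String, List<String>> grouped = new TreeMap<>();
        graph.forEach((k, v) ->
            grouped.computeIfAbsent(k, x -> new ArrayList<>()).add("G:" + v));
        changes.forEach((k, v) ->
            grouped.computeIfAbsent(k, x -> new ArrayList<>()).add("C:" + v));
        return grouped;
    }

    public static void main(String[] args) {
        Map<String, String> graph = Map.of("n1", "edgeA");
        Map<String, String> changes = Map.of("n1", "edgeB");
        System.out.println(join(graph, changes));
    }
}
```

In the real MR job, the shuffle does the grouping, and the reducer decides how a "changes" record modifies the matching "graph" record.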
Hi,
In Hadoop 1.0, I don't think this information is exposed. The
TaskInProgress class is internal and hence cannot / should not be used
from client applications. The only way out seems to be to screen-scrape the
information from the JobTracker web UI.
If you can live with completed events,
From a user perspective, at a high level, the mapreduce package can be
thought of as having user-facing client code that can be invoked, extended
etc as applicable from client programs.
The mapred package is to be treated as internal to the mapreduce system,
and shouldn't directly be used unless
You will incur the cost of MapReduce across all nodes in your cluster anyway.
I am not sure you will get enough speed advantage.
HBase may help you get close to what you are looking for, but that won't be
MapReduce.
Thanks,
Abhishek
From: jamal sasha [mailto:jamalsha...@gmail.com]
Sent: Monday,