Re: new to hadoop and first question

2013-01-07 Thread Prabu
Jim the Standing Bear standingbear@... writes: Hi, I am new to hadoop, and tried to setup a distributed hadoop system. But when I tried to run the example job, it stack dumped with the following exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: Unknown protocol to

Re: new to hadoop and first question

2013-01-07 Thread Jagat Singh
Can you please post your conf files? On Mon, Jan 7, 2013 at 8:11 PM, Prabu gipr...@gmail.com wrote: Jim the Standing Bear standingbear@... writes: Hi, I am new to hadoop, and tried to setup a distributed hadoop system. But when I tried to run the example job, it stack dumped with

Re: new to hadoop and first question

2013-01-07 Thread Vinod Kumar Vavilapalli
You are pointing your JobClient to the Namenode. Check your mapred.job.tracker address and make sure it points to the correct JobTracker node. Also 0.14 is very very old. Please use 1.* releases. HTH, +Vinod On Mon, Jan 7, 2013 at 1:11 AM, Prabu gipr...@gmail.com wrote: Jim the Standing Bear
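For readers hitting the same error: the setting Vinod refers to lives in mapred-site.xml on 1.x releases. A sketch, with a placeholder hostname and port (the symptom in the original post typically means this value points at the NameNode's RPC port instead of the JobTracker's):

```xml
<!-- mapred-site.xml: must point at the JobTracker, not the NameNode -->
<property>
  <name>mapred.job.tracker</name>
  <value>jobtracker.example.com:8021</value>
</property>
```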

Re: Possible to run an application jar as a hadoop daemon?

2013-01-07 Thread Krishna Rao
Thanks for the replies. Harsh J, hadoop classpath was exactly what I needed. Got it working now. Cheers, Krishna On 6 January 2013 11:14, John Hancock jhancock1...@gmail.com wrote: Krishna, You should be able to take the command you are using to start the hadoop job (hadoop jar ..) and
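For anyone finding this thread later, the `hadoop classpath` approach Harsh suggested looks roughly like this (jar name, main class, and log path are all placeholders, not from the original thread):

```sh
# run an application jar as a background daemon with Hadoop's jars on the classpath
nohup java -cp "myapp.jar:$(hadoop classpath)" com.example.MyDaemon \
  > /var/log/myapp.log 2>&1 &
```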

Re: Reg: Fetching TaskAttempt Details from a RunningJob

2013-01-07 Thread Hadoop Learner
Any Suggestions? Thanks and Regards, Shyam On Sun, Jan 6, 2013 at 3:00 PM, Hadoop Learner hadooplearner1...@gmail.com wrote: Hi All, Working on a requirement of hadoop Job Monitoring. The requirement is to get every Task attempt's details for a Running Job. Details are as follows: Task Attempt

hadoop 0.23.5 -files and -libjars

2013-01-07 Thread Viral Bajaria
Hi, I have been trying to play around with the hadoop jar command in 0.23.5 and hive 0.9.0. I wanted to run a custom mapreduce job using: hadoop jar <jar> <main-class> -libjars <comma-separated list of files> -files <comma-separated list of files>. Both libjars and files have the same files specified. The
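A common gotcha with this invocation: -libjars and -files are generic options, parsed by GenericOptionsParser only when the driver class runs through ToolRunner, and they must appear before the job's own arguments. A hypothetical invocation (jar, class, and paths are placeholders):

```sh
# generic options go between the main class and the job arguments
hadoop jar myjob.jar com.example.MyJob \
  -libjars dep1.jar,dep2.jar \
  -files lookup.txt,config.properties \
  /input /output
```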

Differences between 'mapred' and 'mapreduce' packages

2013-01-07 Thread Oleg Zhurakousky
What is the difference between the two? It seems like an MR job could be configured using one or the other (e.g., extends MapReduceBase and implements Mapper, or extends Mapper). Cheers Oleg

Re: Gridmix version 1.0.4 Error

2013-01-07 Thread Sean Barry
@Harsh J I was following an explanation for Gridmix and it runs it using the java command, but here I ran it as hadoop jar with the jar file located in the contrib/gridmix directory in 1.0.4. As you can see, before I run Gridmix the bench folder in hdfs is non-existent, but

64-bit libhadoop.so for 2.0.2-alpha?

2013-01-07 Thread Jane Chen
Where do I get a 64-bit libhadoop.so for 2.0.2-alpha? The one under HADOOP_INSTALL_DIR/lib/native is 32-bit, and I get a wrong ELF class error when running the Hadoop client on a 64-bit platform. Thanks, Jane

Re: Differences between 'mapred' and 'mapreduce' packages

2013-01-07 Thread Sandeep Dukkipati
I am a little new to the hadoop world. But based on my readings and understanding thus far, there is not much functionality difference. The only difference is that the new API allows you to implement both push and pull mechanisms in your map/reduce tasks, as against only push in the old API. Mapper has been changed to super

Re: Can I switch the IP/host of NN without losing the filesystem?

2013-01-07 Thread Robert Molina
Hi James, This should be fine, but just wanted to add that you will need to also make the change on your other nodes within the cluster, so they know how to contact the filesystem. Regards, Robert On Sun, Jan 6, 2013 at 12:18 AM, Jagat Singh jagatsi...@gmail.com wrote: Yes your data is safe.

Re: balancer and under replication

2013-01-07 Thread Harsh J
Under normal operation, the NN takes care of under-replicated blocks by itself. A file with a replication factor set higher than the cluster's node count will also register its blocks as under-replicated. A common config mistake here is mapred.submit.replication, which has a default of 10 (useful for
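The setting Harsh mentions can be overridden in mapred-site.xml; a sketch (10 is the shipped default, chosen so job files are close to many TaskTrackers on large clusters — a small cluster can lower it to its usual replication factor):

```xml
<property>
  <name>mapred.submit.replication</name>
  <value>3</value>
</property>
```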

Re: balancer and under replication

2013-01-07 Thread Patai Sangbutsarakum
Thanks Harsh, you're the first, as usual. Currently, there are 291 active nodes spread across 11 racks. We have had rack-awareness enabled for a year (works like a champ). I ran fsck throughout HDFS again and noticed that the majority of the files that have under-repl. blocks are set to have 255 replicas, but

Re: Can I switch the IP/host of NN without losing the filesystem?

2013-01-07 Thread Jianhui Zhang
Thanks, everybody, for the responses. James On Mon, Jan 7, 2013 at 12:49 PM, Robert Molina rmol...@hortonworks.com wrote: Hi James, This should be fine, but just wanted to add that you will need to also make the change on your other nodes within the cluster, so they know how to contact the

RE: Binary Search in map reduce

2013-01-07 Thread John Lilley
It depends. What data is going into the table, and what keys will drive the lookup? Let's suppose that you have a single JSON file that has some reasonable number of key/value tuples. You could easily load a Hashtable to associate the integer keys with the values (which appear to be lists of

Re: Binary Search in map reduce

2013-01-07 Thread jamal sasha
Hi Thanks for the reply. So here is the intent. I process some data and output of that processing is this set of json documents outputting {key:[values]} (This is essentially a form of graph where each entry is an edge) Now.. I process a different set of data and the idea is to modify the

Re: Binary Search in map reduce

2013-01-07 Thread jamal sasha
awesome. thanks On Mon, Jan 7, 2013 at 4:11 PM, John Lilley john.lil...@redpoint.net wrote: Let’s call these “the graph” and “the changes”. Will both the graph and the changes fit into memory? Yes - You do not have a Hadoop-scale problem. Just write some code using
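A minimal sketch of John's suggestion, assuming both the graph and the changes fit in memory: load the key-to-values map, then apply each change record in plain Java. All names here are hypothetical, and the JSON parsing step is elided:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GraphMerge {
    // graph: key -> adjacency list, as parsed from {"key": [values]} JSON lines
    static Map<String, List<String>> graph = new HashMap<String, List<String>>();

    // apply one change record: append new edges under a key, creating it if absent
    static void applyChange(String key, List<String> newEdges) {
        List<String> edges = graph.get(key);
        if (edges == null) {
            edges = new ArrayList<String>();
            graph.put(key, edges);
        }
        edges.addAll(newEdges);
    }

    public static void main(String[] args) {
        graph.put("a", new ArrayList<String>(Arrays.asList("b", "c")));
        applyChange("a", Arrays.asList("d"));   // existing key: append
        applyChange("x", Arrays.asList("y"));   // new key: create
        System.out.println(graph.get("a"));     // prints [b, c, d]
        System.out.println(graph.get("x"));     // prints [y]
    }
}
```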

Re: balancer and under replication

2013-01-07 Thread alxsss
Are you sure the balancer does anything? I have about 500 missing replicas and 60 under-replicated blocks, and when I start the balancer it does not do anything. The balancer outputs two lines: INFO org.apache.hadoop.hdfs.server.balancer.Balancer: 0 over utilized nodes: INFO

Re: Capacity Scheduler questions

2013-01-07 Thread Vinod Kumar Vavilapalli
We would like to configure the equivalent of Fair Scheduler userMaxJobsDefault = 1 (i.e. we would like to limit a user to a single job in the cluster). By default the Capacity Scheduler allows multiple jobs from a single user to run concurrently. From

Re: Differences between 'mapred' and 'mapreduce' packages

2013-01-07 Thread Mahesh Balija
Hi Oleg, The Mapreduce 0.20.* api has support for both the 0.19 api (the mapred package, in which your mapper should extend MapReduceBase and implement Mapper) and the new api as well (the mapreduce package, where you directly extend Mapper). As there are significant

Re: Binary Search in map reduce

2013-01-07 Thread Mahesh Balija
Hi Jamal, Another simple approach, if your data is too huge and cannot fit into memory, would be to use the MultipleInputs mechanism, where your MR job will have two mappers, one emitting the records from the graph file and the other from the changes file. Anyhow, your reducer

Re: Reg: Fetching TaskAttempt Details from a RunningJob

2013-01-07 Thread Hemanth Yamijala
Hi, In Hadoop 1.0, I don't think this information is exposed. The TaskInProgress is an internal class and hence cannot / should not be used from client applications. The only way out seems to be to screen scrape the information from the Jobtracker web UI. If you can live with completed events,
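For the completed-events route Hemanth mentions, the public client API in Hadoop 1.0 is RunningJob.getTaskCompletionEvents; a compile-only sketch (it needs the hadoop-core 1.x jar on the classpath, and the job ID is a placeholder):

```java
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobID;
import org.apache.hadoop.mapred.RunningJob;
import org.apache.hadoop.mapred.TaskCompletionEvent;

public class TaskEventPoller {
    public static void main(String[] args) throws Exception {
        JobClient client = new JobClient(new JobConf());
        RunningJob job = client.getJob(JobID.forName("job_201301070000_0001")); // placeholder
        int from = 0;
        TaskCompletionEvent[] events;
        do {
            // returns the next batch of completed task attempts; empty once caught up
            events = job.getTaskCompletionEvents(from);
            for (TaskCompletionEvent e : events) {
                System.out.println(e.getTaskAttemptId() + " " + e.getTaskStatus());
            }
            from += events.length;
        } while (events.length > 0);
    }
}
```

Note this only reports completed attempts; live per-attempt progress is the part that, as Hemanth says, is not exposed in 1.0.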

Re: Differences between 'mapred' and 'mapreduce' packages

2013-01-07 Thread Hemanth Yamijala
From a user perspective, at a high level, the mapreduce package can be thought of as having user facing client code that can be invoked, extended etc as applicable from client programs. The mapred package is to be treated as internal to the mapreduce system, and shouldn't directly be used unless
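To make the contrast in this thread concrete, here is a compile-only sketch of the two mapper styles (it needs the Hadoop 1.x jars on the classpath and is not a complete job; the identity-style bodies are illustrative only):

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;

// Old API (org.apache.hadoop.mapred): Mapper is an interface, and
// output is pushed through an OutputCollector handed to you.
class OldApiMapper extends org.apache.hadoop.mapred.MapReduceBase
        implements org.apache.hadoop.mapred.Mapper<LongWritable, Text, Text, IntWritable> {
    public void map(LongWritable key, Text value,
                    org.apache.hadoop.mapred.OutputCollector<Text, IntWritable> out,
                    org.apache.hadoop.mapred.Reporter reporter) throws IOException {
        out.collect(value, new IntWritable(1));
    }
}

// New API (org.apache.hadoop.mapreduce): Mapper is a class you extend,
// and a single Context object replaces OutputCollector + Reporter.
class NewApiMapper
        extends org.apache.hadoop.mapreduce.Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        context.write(value, new IntWritable(1));
    }
}
```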

RE: Binary Search in map reduce

2013-01-07 Thread Pamecha, Abhishek
You will incur the cost of map-reduce across all nodes in your cluster anyway. I am not sure you will get enough of a speed advantage. HBase may help you get close to what you are looking for, but that won't be map-reduce. Thanks, Abhishek From: jamal sasha [mailto:jamalsha...@gmail.com] Sent: Monday,